1. Introduction


In  retrospect,  it  is  clear  that  Algol  60  [2,  3]  was  an  heroic  and  surprisingly  successful  attempt 
to  design  a  programming  language  from  first  principles.  Its  creation  gave  a  formidable 
impetus  to  the  development  and  use  of  theory  in  language  design  and  implementation, 
which  has  borne  rich  fruit  in  the  intervening  thirty-six  years.  Most  of  this  work  has  led  to 
languages that are quite different from Algol 60, but there has been a continuing thread of
concern  with  languages  that  retain  the  essential  character  of  the  original  language  [4,  5].  We 
feel  that  research  in  this  direction  has  reached  the  point  where  it  is  desirable  to  design  a 
modern  Algol-like  language  that  is  as  uniform  and  general  as  possible. 

This  is  the  goal  of  the  programming  language  Forsythe.  We  believe  that  it  retains  the 
essence of Algol 60, yet is both simpler and more general. The key to achieving this combination
of simplicity and generality is to exploit the procedure mechanism and the type system,
in  order  to  replace  a  multitude  of  specialized  features  by  a  few  general  constructions. 

The  language  is  named  after  George  E.  Forsythe,  founding  chairman  of  the  Computer 
Science  Department  at  Stanford  University.  Among  his  many  accomplishments,  he  played  a 
major  role  in  familiarizing  American  computer  scientists  (including  the  author)  with  Algol. 

Before considering Forsythe in detail, we specify its location in the design space of
programming languages. As discussed below, Forsythe lies on one side of each of three
fundamental dichotomies in language design.

First, it is a typed language. It has long been understood that imposing a type discipline can
yield major improvements in compile-time error detection and in the efficiency of run-time
data representations. However, type systems that are flexible enough to support
sophisticated programming techniques are a much more recent development.

Second,  Forsythe  has  imperative  features  (i.e.  assignment  and  control  flow)  as  well  as 
a  powerful  procedure  mechanism.  Like  all  such  languages,  it  suffers  from  the  problems  of 
aliasing  and  interference.  However,  we  believe  that  imperative  programming  is  a  fundamental 
paradigm  that  should  not  be  ignored  in  programming  language  design. 

Finally,  Forsythe  uses  call  by  name  rather  than  call  by  value.  For  purely  functional 
languages  this  is  merely  a  distinction  between  orders  of  evaluation,  but  for  languages  with 
imperative features it is a fundamental dichotomy in the way that the imperative and
functional aspects are linked; one is tempted to speak of Algol-like versus ISWIM-like languages.

In  any  event,  the  following  basic  operational  view,  which  is  implicit  in  Algol  60,  underlies 
Forsythe  and  distinguishes  it  from  such  languages  as  ISWIM  [6],  Algol  68  [7],  Scheme  [8], 
and  ML  [9]:  The  programming  language  is  a  typed  lambda  calculus  with  a  primitive  type 
comm(and),  such  that  terms  of  this  type,  when  reduced  to  normal  form,  are  commands  in 
the  simple  imperative  language.  Thus  a  program,  which  must  be  a  term  of  type  comm, 
is  executed  in  two  phases.  First  the  program  is  reduced  to  normal  form.  (In  Algol  jargon, 
the  copy  rule  is  repeatedly  applied  to  eliminate  procedure  calls.)  Then  the  resulting  simple 
imperative  program  is  executed: 

    Reduction of lambda expressions (copy rule)
                    |
                    |  normal form
                    v
    Execution of commands in the simple imperative language

The  only  complication  is  that,  in  the  presence  of  recursion,  the  reduction  phase  may  go  on 
forever,  producing  an  infinite  or  partial  “normal  form”.  Nevertheless,  such  an  infinite  term 
can  still  be  viewed  as  a  simple  imperative  program;  operationally,  one  simply  implements 
the  two  phases  as  coroutines. 

Even in this more general situation, the above diagram still describes an essential
restriction on the flow of information: Nothing that happens in the second phase ever affects
anything  that  happens  in  the  first  phase.  Thus  Forsythe  inherits  the  basic  property  of  the 
lambda  calculus  that  meaning  does  not  depend  upon  the  order  or  timing  of  reductions. 
Indeed,  reduction  rules  can  be  viewed  as  equations  satisfied  by  the  language. 

In  contrast,  consider  the  situation  in  an  ISWIM-like  language  such  as  Scheme  or  ML  that 
provides assignable function variables. If f is such a variable, then the effect of reducing
f(···) will depend upon when the reduction occurs relative to the sequence of assignments
to f that are executed in the imperative phase. In this situation, the procedure mechanism
is  completely  stripped  of  its  functional  character. 
2. From Algol to Forsythe: An Evolution of Types


The  long  evolution  which  has  led  from  Algol  60  to  Forsythe  is  too  complex  to  recount  here 
in  detail.  However,  to  provide  an  overview  of  Forsythe  and  reveal  its  relationship  to  Algol,  it 
is  useful  to  outline  the  development  of  the  heart  of  the  language,  which  is  its  type  structure. 
(In  this  introductory  account,  we  retain  the  familiar  notations  of  Algol,  rather  than  using 
the  novel  notations  of  Forsythe.) 

An  essential  characteristic  of  an  Algol-like  language  is  that  the  variety  of  entities  that  can 
be  the  value  of  a  variable  or  expression  is  different  from  the  variety  of  entities  that  can  be 
the  meaning  of  identifiers  or  phrases.  We  capture  this  characteristic  by  distinguishing  two 
kinds  of  type  (as  in  [5]  and  [10]): 

•  A  data  type  denotes  a  set  of  values  appropriate  to  a  variable  or  expression. 

•  A  phrase  type,  or  more  simply  a  type,  denotes  a  set  of  meanings  appropriate  to  an 
identifier  or  phrase. 

In  Algol  60,  there  are  three  data  types:  integer,  real,  and  boolean.  In  Forsythe,  we  use 
more  succinct  names,  int,  real,  and  bool,  and  add  a  fourth  data  type,  char,  denoting  the 
set  of  machine-representable  characters. 

To capture the existence of an implicit conversion from integers to reals, we define a
partial order on data types called the subtype relation. We write $\delta \le \delta'$, and say that $\delta$ is a
subtype of $\delta'$, when either $\delta = \delta'$ or $\delta$ = int and $\delta'$ = real, i.e.

[Diagram: int lies below real; bool and char are unrelated to the other data types.]


In Algol 60, the phrase types are the entities, such as integer, real array, and
procedure, that are used to specify procedure parameters. However, the phrase types of Algol 60
are not sufficiently refined to permit a compiler to detect all type errors. For example, in
both

procedure silly(x); integer x; y := x


and


procedure strange(x); integer x; x := x + 1


the  formal  parameter  x  is  given  the  type  integer,  despite  the  fact  that  an  actual  parameter 
for  silly  can  be  any  integer  expression,  since  x  is  evaluated  but  never  assigned  to,  while  an 
actual  parameter  for  strange  must  be  an  integer  variable,  since  x  is  assigned  to  as  well  as 
evaluated. 
To remedy this defect, one must distinguish the phrase types int(eger) exp(ression) and
int(eger) var(iable), writing

procedure silly(x); intexp x; y := x

and

procedure strange(x); intvar x; x := x + 1 .

(In  a  similar  manner,  each  of  the  other  data  types  gives  rise  to  both  a  phrase  type  of 
expressions  and  a  phrase  type  of  variables.) 

Like data types, phrase types possess a subtype relationship. Semantically, $\theta \le \theta'$ means
that there is an implicit conversion from meanings of type $\theta$ to meanings of type $\theta'$. But the
subtype relation can also be interpreted syntactically: $\theta \le \theta'$ means that a phrase of type $\theta$
can be used in any context requiring a phrase of type $\theta'$. Thus, since a variable can be used
as an expression, intvar $\le$ intexp, and similarly for the other data types. Moreover, since
an integer expression can be used as a real expression, intexp $\le$ realexp. In summary:

[Diagram: Hasse diagram generated by realvar $\le$ realexp, intvar $\le$ intexp $\le$ realexp,
boolvar $\le$ boolexp, and charvar $\le$ charexp.]


However,  there  is  an  unpleasant  asymmetry  here.  It  can  be  remedied  by  distinguishing, 
in  addition  to  expressions  which  can  be  evaluated  but  not  assigned  to,  acceptors  which  can 
be  assigned  to  but  not  evaluated.  Then,  for  example,  we  can  write 


procedure peculiar(x); intacc x; x := 0

to indicate that peculiar assigns to its parameter but never evaluates it.

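Clearly, intvar $\le$ intacc, and similarly for the other data types. Moreover, realacc $\le$
intacc, since an acceptor that can accept any real number can accept any integer. Thus the
subtype relation is

[Diagram: Hasse diagram generated by
realvar $\le$ realacc $\le$ intacc, realvar $\le$ realexp,
intvar $\le$ intacc, intvar $\le$ intexp $\le$ realexp,
boolvar $\le$ boolacc, boolvar $\le$ boolexp,
charvar $\le$ characc, charvar $\le$ charexp.]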

However, there is a further problem. In Forsythe, the conditional construction is
generalized from expressions and commands to arbitrary phrase types; in particular one can
construct conditional variables. Thus if p is a boolean expression, n is an integer variable,
and x is a real variable, one can write

if  p  then  n  else  x 

on  either  side  of  an  assignment  command.  But  when  this  construction  occurs  on  the  right  of 
an  assignment,  it  must  be  regarded  as  a  real  expression,  since  p  might  be  false,  while  when 
it  occurs  on  the  left  of  an  assignment,  it  must  be  regarded  as  an  integer  acceptor,  since  p 
might  be  true.  Thus  the  construction  is  an  int(eger  accepting),  real  (producing)  var(iable), 
which  fits  into  the  subtype  relation  as  follows: 

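[Diagram: (int, real) var lies below both intacc and realexp and above both realvar and
intvar; in addition, realvar $\le$ realacc $\le$ intacc and intvar $\le$ intexp $\le$ realexp.]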
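
To make the conditional variable concrete, here is a small Python sketch (not Forsythe). It assumes a variable is modelled as a pair of an expression part and an acceptor part; the class Var and the helpers cell and cond_var are hypothetical names introduced only for this illustration. Because p is passed as a thunk, it is re-evaluated at each use, in the spirit of call by name.

```python
class Var:
    """A variable viewed as an (expression, acceptor) pair; names are hypothetical."""
    def __init__(self, get, put):
        self.get = get    # expression part: produces a value
        self.put = put    # acceptor part: consumes a value

def cell(initial):
    """An ordinary variable backed by one storage location."""
    box = [initial]
    def put(value):
        box[0] = value
    return Var(lambda: box[0], put)

def cond_var(p, n, x):
    """`if p then n else x` for an integer variable n and a real variable x."""
    def get():
        # Must be treated as a real expression, since p might be false.
        return float(n.get()) if p() else x.get()
    def put(value):
        # Must be treated as an integer acceptor, since p might be true.
        if p():
            n.put(value)
        else:
            x.put(float(value))
    return Var(get, put)

flag, n, x = cell(True), cell(1), cell(2.5)
v = cond_var(flag.get, n, x)
v.put(7)          # p is true: assigns to n
flag.put(False)
v.put(3)          # p is false: assigns 3.0 to x
```

The resulting v accepts integers and produces reals, which is exactly the role of the (int, real) var type in the diagram above.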

Next, we consider the types of procedures. In Algol 60, when a parameter is a
procedure, one simply specifies procedure for a proper procedure (whose call is a command),
or  integer  procedure,  real  procedure,  or  boolean  procedure  for  a  function  procedure 
(whose  call  is  an  expression).  But  to  obtain  full  compile-time  typechecking,  one  must  use 
more  refined  phrase  types  that  indicate  the  number  and  type  of  parameters,  e.g. 

procedure(intexp,  intvar) 

to  denote  a  proper  procedure  accepting  an  integer  expression  and  an  integer  variable,  or 

real  procedure(realexp) 

to  denote  a  real  procedure  accepting  a  real  expression.  (Note  that  this  refinement  introduces 
an  infinite  number  of  phrase  types.) 

These constructions can be simplified and generalized by introducing a binary type
constructor $\to$ such that $\theta \to \theta'$ denotes the type of procedures that accept $\theta$ and produce $\theta'$ or,
more precisely, the type of procedures that accept a single parameter of type $\theta$ and whose
calls are phrases of type $\theta'$. For example, a real procedure accepting a real expression would
have type realexp $\to$ realexp.

To describe proper procedures similarly, it is necessary to introduce the type comm
to describe phrases that are commands (or in Algol jargon, statements). Then a proper
procedure accepting an integer variable would have type intvar $\to$ comm.

The  idea  that  commands  are  not  a  kind  of  expression  is  one  of  the  things  that  distinguishes 
Algol-like languages from languages such as Scheme or ML, where commands are simply
expressions  that  produce  trivial  values  while  performing  side  effects.  This  distinction  is  even 
sharper  for  Forsythe,  where  expressions  cannot  have  side  effects. 
To extend the typing of procedures to permit more than one parameter, one might
introduce a type constructor for products and regard, say, procedure(intexp, intvar) as
(intexp $\times$ intvar) $\to$ comm. However, as we will see below, the product-like construction
in Forsythe describes objects whose fields are selected by names rather than position. Thus
multiple-parameter procedures are more easily obtained by Currying rather than by the use
of products.

For example, procedure(intexp, intvar) becomes intexp $\to$ (intvar $\to$ comm) or,
more simply, intexp $\to$ intvar $\to$ comm, since $\to$ is right associative. In other words, a
proper procedure accepting an integer expression and an integer variable is really a procedure
accepting an integer expression whose calls are procedures accepting an integer variable
whose calls are commands. Thus the call $p(a_1, a_2)$ is written $(p(a_1))(a_2)$ or, more simply,
$p(a_1)(a_2)$, since procedure application is left associative. (In fact, if the parameters are
identifiers or constants, one can simply write $p\ a_1\ a_2$.)
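
A rough Python analogue of Currying may help (a sketch, not Forsythe). The procedure p, the thunk used for the integer expression, and the dictionary used as an integer variable are all hypothetical modelling choices made for this example.

```python
def p(a1):
    """Plays the role of type intexp -> (intvar -> comm)."""
    def accept_var(a2):
        """Plays the role of type intvar -> comm."""
        def command():
            # The command itself: runs only when it is executed.
            a2["value"] = a1() + 1
        return command
    return accept_var

cell = {"value": 0}        # stands in for an integer variable
expr = lambda: 41          # stands in for an integer expression (call by name)

partial = p(expr)          # a procedure still awaiting its integer variable
command = partial(cell)    # a command, not yet executed
before = cell["value"]
command()                  # executing the command performs the assignment
```

Applying p to one argument at a time mirrors the reading of p(a_1)(a_2) as (p(a_1))(a_2): each application yields a phrase of the next type to the right of the arrow.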


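In general, the type

    procedure($\theta_1, \ldots, \theta_n$)

becomes

    $\theta_1 \to \cdots \to \theta_n \to$ comm ,

and, for each data type $\delta$, the type

    $\delta$ procedure($\theta_1, \ldots, \theta_n$)

becomes

    $\theta_1 \to \cdots \to \theta_n \to \delta$exp .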


Moreover,  this  generalization  includes  the  special  case  where  n  =  0,  so  that  parameterless 
proper  procedures  are  simply  commands  and  parameterless  function  procedures  are  simply 
expressions.  (Note  that  this  simplification  is  permissible  for  call  by  name,  but  would  not  be 
for  call  by  value,  where  parameterless  procedures  are  needed  —  as  in  LISP  —  to  postpone 
evaluation.) 

To determine the subtype relation for procedural types, suppose $\theta_1' \le \theta_1$ and $\theta_2 \le \theta_2'$.
Then a procedure of type $\theta_1 \to \theta_2$ can accept a parameter of type $\theta_1'$ (since this parameter
can be converted to type $\theta_1$) and its call can have type $\theta_2'$ (since it can be converted from $\theta_2$
to $\theta_2'$), so that the procedure also has type $\theta_1' \to \theta_2'$. Thus

    If $\theta_1' \le \theta_1$ and $\theta_2 \le \theta_2'$ then $\theta_1 \to \theta_2 \le \theta_1' \to \theta_2'$ ,

i.e. $\to$ is antimonotone in its first operand and monotone in its second operand.
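
The argument above is essentially a recipe for building the implicit conversion on procedures out of the conversions on their parameter and result types. A minimal Python sketch, assuming conversions are modelled as ordinary functions (convert_proc, int_to_real, and halve_floor are hypothetical names):

```python
def convert_proc(f, to_param, from_result):
    """Turn f : theta1 -> theta2 into a procedure of type theta1' -> theta2'.

    to_param    converts theta1' to theta1 (the antimonotone side)
    from_result converts theta2 to theta2' (the monotone side)
    """
    return lambda x: from_result(f(to_param(x)))

int_to_real = float                 # the conversion for intexp <= realexp

def halve_floor(r):
    """A sample procedure of type realexp -> intexp."""
    return int(r // 2)

# Since intexp <= realexp on both sides, halve_floor can also be used
# at type intexp -> realexp.
as_int_to_real = convert_proc(halve_floor, int_to_real, int_to_real)
result = as_int_to_real(7)
```

The parameter conversion is applied before f and the result conversion after it, which is why the two operands of the arrow vary in opposite directions.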
For example, since intexp $\le$ realexp, we have

[Diagram: realexp $\to$ intexp lies below both intexp $\to$ intexp and realexp $\to$ realexp,
which in turn both lie below intexp $\to$ realexp.]

We  have  already  seen  that  comm(and)  must  be  introduced  as  a  primitive  phrase  type. 
It  is  also  useful  to  introduce  a  subtype  of  comm  called  compl(etion): 

[Diagram: compl lies below comm.]

Essentially,  a  completion  is  a  special  type  of  command,  such  as  a  goto  command,  that  never 
returns  control. 

The  advantage  of  distinguishing  completions  is  that  control  structure  can  be  made  more 
evident.  For  example,  in 

procedure sqroot(x, y, error); intexp x; intvar y; compl error;
begin if x < 0 then error; C end ,

specifying error to be a completion makes it evident that C will never be executed when
x < 0.

As  mentioned  earlier,  Forsythe  has  a  type  constructor  for  named  products.  The  basic 
idea  is  that  the  phrase  type 

    $(\iota_1\colon \theta_1, \ldots, \iota_n\colon \theta_n)$

is possessed by objects with fields named by the distinct identifiers $\iota_1, \ldots, \iota_n$, in which the
field named $\iota_k$ has type $\theta_k$. Note that the meaning of this phrase type is independent of the
order of the $\iota_k\colon \theta_k$ pairs. We use the term “object” rather than “record” since fields need
not  be  variables.  For  example,  one  could  have  a  field  of  type  intvar  — >  comm  that  could 
be  called  as  a  proper  procedure,  but  not  assigned  to.  (Roughly  speaking,  objects  are  more 
like  class  members  in  Simula  67  [11]  than  like  records  in  Algol  W  [4].) 

Clearly,  the  product  constructor  should  be  monotone: 

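    If $n \ge 0$ and $\theta_1 \le \theta_1'$ and . . . and $\theta_n \le \theta_n'$ then

    $(\iota_1\colon \theta_1, \ldots, \iota_n\colon \theta_n) \le (\iota_1\colon \theta_1', \ldots, \iota_n\colon \theta_n')$ .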

In  fact,  a  richer  subtype  relationship  is  desirable,  in  which  objects  can  be  converted  by 
“forgetting”  fields,  so  that  an  object  can  be  used  in  a  context  requiring  a  subset  of  its 
fields. This relationship (which is closely related to “multiple inheritance” in object-oriented
programming [12]) is expressed by

    If $n \ge m \ge 0$ and $\theta_1 \le \theta_1'$ and . . . and $\theta_m \le \theta_m'$ then

    $(\iota_1\colon \theta_1, \ldots, \iota_n\colon \theta_n) \le (\iota_1\colon \theta_1', \ldots, \iota_m\colon \theta_m')$ .


For example,

[Diagram: (partno: intvar, cost: realvar) lies below (partno: int, cost: real), which in
turn lies below both partno: int and cost: real.]
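
The conversion behind this richer relationship can be pictured in Python by modelling an object as a dictionary of fields (a sketch under that assumption; convert_object and the sample data are hypothetical). Fields not mentioned in the target type are forgotten, and each retained field is passed through its own conversion.

```python
def convert_object(obj, field_conversions):
    """Keep only the named fields, converting each retained field.

    field_conversions maps each retained field name to the implicit
    conversion for that field's type (the identity when the type is unchanged).
    """
    return {name: convert(obj[name])
            for name, convert in field_conversions.items()}

identity = lambda value: value
part = {"partno": 1207, "cost": 3, "supplier": "hypothetical"}

# Forget `supplier`, keep `partno` unchanged, and view the integer cost as a real.
narrowed = convert_object(part, {"partno": identity, "cost": float})

# Forget everything except `cost`.
cost_only = convert_object(part, {"cost": float})
```

The case m = 0 corresponds to passing an empty mapping, which yields an object with no fields at all.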

At  this  point,  we  have  summarized  the  type  structure  of  Forsythe  (then  called  “Idealized 
Algol”)  as  it  appeared  in  about  1981  [5].  Since  then,  the  language  has  been  generalized,  and 
considerably simplified, by the introduction of intersection types [13, 14, 15].

(At  the  outset,  a  caution  must  be  sounded  that  this  use  of  the  word  “intersection”  can 
be misleading. If one thinks of types as standing for sets, then the intersection of two types
need  not  stand  for  the  intersection  of  the  two  corresponding  sets.  In  earlier  papers,  we  used 
the  term  “conjunctive  type”,  but  this  was  equally  misleading  in  other  contexts,  and  never 
became  widely  accepted.) 

The basic idea is to introduce a type constructor &, with the interpretation that a phrase
has type $\theta_1 \& \theta_2$ if and only if it has both type $\theta_1$ and type $\theta_2$. This interpretation leads to
the subtype laws

    $\theta_1 \& \theta_2 \le \theta_1$
    $\theta_1 \& \theta_2 \le \theta_2$

    If $\theta \le \theta_1$ and $\theta \le \theta_2$ then $\theta \le \theta_1 \& \theta_2$ ,

which assert that $\theta_1 \& \theta_2$ is a greatest lower bound of $\theta_1$ and $\theta_2$. (Note that the introduction
of the intersection operation makes the subtype relation a preorder rather than a partial
order, since one can have distinct types, such as $\theta_1 \& \theta_2$ and $\theta_2 \& \theta_1$, each of which is a
subtype of the other. In this situation, we will say that the types are equivalent.)
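
As a worked instance, the claim that $\theta_1 \& \theta_2$ is a subtype of $\theta_2 \& \theta_1$ follows from the three laws alone. The derivation below is a sketch written in LaTeX; the law numbering refers to the order in which the laws are listed above.

```latex
\begin{align*}
\theta_1 \& \theta_2 &\le \theta_2
  && \text{(second law)} \\
\theta_1 \& \theta_2 &\le \theta_1
  && \text{(first law)} \\
\theta_1 \& \theta_2 &\le \theta_2 \& \theta_1
  && \text{(third law, taking } \theta := \theta_1 \& \theta_2
     \text{ with targets } \theta_2 \text{ and } \theta_1 \text{)}
\end{align*}
```

Exchanging the roles of $\theta_1$ and $\theta_2$ gives the opposite inclusion, so each type is a subtype of the other.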

We  will  see  that  intersection  types  provide  the  ability  to  define  procedures  with  more 
than  one  type.  For  example 


procedure poly(x); x × x + 2

can be given the type (intexp $\to$ intexp) & (realexp $\to$ realexp). At present, however,
the main point is that intersection can be used to simplify the structure of types.

First, the various types of variables can be regarded as intersections of expressions and
acceptors. For example, intvar is intexp & intacc, realvar is realexp & realacc, and
(int, real) var is realexp & intacc.

Second,  a  product  type  with  more  than  one  field  can  be  regarded  as  an  intersection  of 
product  types  with  single  fields.  Thus,  instead  of 

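    $(\iota_1\colon \theta_1, \ldots, \iota_n\colon \theta_n)$ ,

one writes

    $\iota_1\colon \theta_1 \;\&\; \cdots \;\&\; \iota_n\colon \theta_n$ .

Note that the field-forgetting relationship becomes a consequence of $\theta_1 \& \theta_2 \le \theta_1$.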


A final simplification concerns acceptors. The meaning of a $\delta$ acceptor a (for any data type
$\delta$) is completely determined by the meanings of the commands a := e for all $\delta$ expressions
e. Thus a has the same kind of meaning as a procedure of type $\delta$exp $\to$ comm. As a
consequence, we can regard $\delta$acc as an abbreviation for $\delta$exp $\to$ comm, and a := e as an
abbreviation for a(e). (As discussed in Section 8, this treatment of assignment as procedure
call is a controversial generalization of the usual concept of assignment.)


3.  Types  and  the  Subtype  Relation 


Having  sketched  its  evolution,  we  can  now  define  the  type  system  of  Forsythe  precisely.  The 
sets  of  data  types,  primitive  (phrase)  types,  and  (phrase)  types  can  be  defined  by  an  abstract 
grammar: 

    $\delta$ ::= int | real | bool | char                                        (data types)

    $\rho$ ::= $\delta$ | value | comm | compl                                   (primitive types)

    $\theta$ ::= $\rho$ | $\theta \to \theta$ | $\iota\colon \theta$ | ns | $\theta \& \theta$        (types)

where the metavariable $\iota$ ranges over identifiers.


Here  there  are  three  changes  from  the  previous  section.  Expression  types  are  now  named 
by  their  underlying  data  types;  for  example,  intexp  is  now  just  int.  A  new  primitive  type 
value  stands  for  the  union  of  all  the  data  types;  its  utility  will  become  apparent  in  Section 
4.  Finally,  a  new  phrase  type  ns  (for  “nonsense”)  has  been  introduced;  it  is  possessed  by  all 
(parsable)  phrases  of  the  language,  and  can  be  viewed  as  a  unit  for  the  operation  &,  i.e.  as 
the  intersection  of  the  empty  set  of  types. 


The subtype relation $\le_{\mathrm{prim}}$ for primitive types is the partial order

[Diagram: int lies below real; real, bool, and char lie below value; compl lies below comm.]
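For types, $\le$ is the least preorder such that

    $\theta \le \mathbf{ns}$

    $\theta_1 \& \theta_2 \le \theta_1$
    $\theta_1 \& \theta_2 \le \theta_2$

    If $\theta \le \theta_1$ and $\theta \le \theta_2$ then $\theta \le \theta_1 \& \theta_2$
    If $\rho \le_{\mathrm{prim}} \rho'$ then $\rho \le \rho'$

    If $\theta \le \theta'$ then $\iota\colon \theta \le \iota\colon \theta'$

    If $\theta_1' \le \theta_1$ and $\theta_2 \le \theta_2'$ then $\theta_1 \to \theta_2 \le \theta_1' \to \theta_2'$
    $\iota\colon \theta_1 \;\&\; \iota\colon \theta_2 \le \iota\colon (\theta_1 \& \theta_2)$

    $(\theta \to \theta_1) \& (\theta \to \theta_2) \le \theta \to (\theta_1 \& \theta_2)$

    $\mathbf{ns} \le \iota\colon \mathbf{ns}$
    $\mathbf{ns} \le \theta \to \mathbf{ns}$ .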

We write $\theta \simeq \theta'$, and say that $\theta$ and $\theta'$ are equivalent, when $\theta \le \theta'$ and $\theta' \le \theta$. The first four
relationships establish that ns is a greatest type and that $\theta_1 \& \theta_2$ is a greatest lower bound
of $\theta_1$ and $\theta_2$. Note that we say “a” rather than “the”; neither greatest types nor greatest
lower bounds are unique, since we have a preorder rather than a partial order. However, any
greatest type must be equivalent to ns, and any greatest lower bound of $\theta_1$ and $\theta_2$ must be
equivalent to $\theta_1 \& \theta_2$.

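The fact that ns is a greatest type and & is a greatest lower bound operator has the
following consequences:

    $\theta_1 \& (\theta_2 \& \theta_3) \simeq (\theta_1 \& \theta_2) \& \theta_3$

    $\theta \& \mathbf{ns} \simeq \theta$
    $\mathbf{ns} \& \theta \simeq \theta$
    $\theta_1 \& \theta_2 \simeq \theta_2 \& \theta_1$
    $\theta \& \theta \simeq \theta$

    If $\theta_1 \le \theta_1'$ and $\theta_2 \le \theta_2'$ then $\theta_1 \& \theta_2 \le \theta_1' \& \theta_2'$
    $\theta \le \theta_1 \& \theta_2$ iff $\theta \le \theta_1$ and $\theta \le \theta_2$ .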

The next three relationships in the definition of $\le$ assert that primitive types are related
by $\le_{\mathrm{prim}}$, that the object-type constructor is monotone, and that $\to$ is antimonotone in
its first operand and monotone in its second operand. The last four relationships have the
following consequences:

    $\iota\colon (\theta_1 \& \theta_2) \simeq \iota\colon \theta_1 \;\&\; \iota\colon \theta_2$

    $\theta \to (\theta_1 \& \theta_2) \simeq (\theta \to \theta_1) \& (\theta \to \theta_2)$

    $\iota\colon \mathbf{ns} \simeq \mathbf{ns}$

    $\theta \to \mathbf{ns} \simeq \mathbf{ns}$ .

The first two of these equivalences show that intersection distributes with object constructors
(modulo $\simeq$) and with the right side (but not the left) of $\to$. The last two equivalences are
analogous laws for the intersection of zero types.
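
One way to experiment with these laws is a small decision procedure for $\le$, sketched here in Python. The tuple encoding of types and the function names are hypothetical. The treatment of field and procedure targets (collect the matching components of the left-hand type and intersect their results) is an assumption about how the laws can be decided; it is not taken from the text and is offered without a completeness proof.

```python
# Types: primitive names and "ns" are strings; ("->", a, b) is a -> b;
# ("field", name, t) is name: t; ("&", a, b) is a & b.
PRIM_LE = {("int", "real"), ("int", "value"), ("real", "value"),
           ("bool", "value"), ("char", "value"), ("compl", "comm")}

def meet(*types):
    """Intersection of zero or more types; the empty intersection is ns."""
    result = "ns"
    for t in types:
        result = t if result == "ns" else ("&", result, t)
    return result

def components(t):
    """Flatten nested intersections, dropping ns (the unit of &)."""
    if t == "ns":
        return []
    if isinstance(t, tuple) and t[0] == "&":
        return components(t[1]) + components(t[2])
    return [t]

def subtype(s, t):
    """Decide s <= t by recursion on the structure of the target t."""
    if t == "ns":
        return True
    if isinstance(t, str):  # primitive target
        return any(c == t or (c, t) in PRIM_LE
                   for c in components(s) if isinstance(c, str))
    if t[0] == "&":
        return subtype(s, t[1]) and subtype(s, t[2])
    if t[0] == "field":
        parts = [c[2] for c in components(s)
                 if isinstance(c, tuple) and c[0] == "field" and c[1] == t[1]]
        return subtype(meet(*parts), t[2])
    if t[0] == "->":
        results = [c[2] for c in components(s)
                   if isinstance(c, tuple) and c[0] == "->"
                   and subtype(t[1], c[1])]
        return subtype(meet(*results), t[2])
    return False

intacc, realacc = ("->", "int", "comm"), ("->", "real", "comm")
intvar, realvar = meet("int", intacc), meet("real", realacc)
mixed_var = meet("real", intacc)      # the (int, real) var of Section 2
```

With this encoding, subtype(realacc, intacc) holds while subtype(intacc, realacc) does not, and both intvar and realvar are subtypes of mixed_var, in agreement with the diagrams of the previous section.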

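It can be shown that every pair of types has a least upper bound (which is unique modulo
$\simeq$). In particular, the following equivalences suffice to compute a least upper bound, $\theta_1 \sqcup \theta_2$,
of any types $\theta_1$ and $\theta_2$:

    $\theta_1 \sqcup \theta_2 \simeq \theta_2 \sqcup \theta_1$
    $\theta \sqcup \mathbf{ns} \simeq \mathbf{ns}$

    $\theta_1 \sqcup (\theta_2 \& \theta_3) \simeq (\theta_1 \sqcup \theta_2) \& (\theta_1 \sqcup \theta_3)$

    $\rho \sqcup \iota\colon \theta \simeq \mathbf{ns}$

    $\rho \sqcup (\theta_1 \to \theta_2) \simeq \mathbf{ns}$
    $\iota\colon \theta_1 \sqcup (\theta_2 \to \theta_3) \simeq \mathbf{ns}$

    $\rho_1 \sqcup \rho_2 \simeq \rho_1 \sqcup_{\mathrm{prim}} \rho_2$ when $\rho_1 \sqcup_{\mathrm{prim}} \rho_2$ exists
    $\rho_1 \sqcup \rho_2 \simeq \mathbf{ns}$ when $\rho_1 \sqcup_{\mathrm{prim}} \rho_2$ does not exist

    $\iota\colon \theta_1 \sqcup \iota\colon \theta_2 \simeq \iota\colon (\theta_1 \sqcup \theta_2)$

    $\iota_1\colon \theta_1 \sqcup \iota_2\colon \theta_2 \simeq \mathbf{ns}$ when $\iota_1 \ne \iota_2$
    $(\theta_1 \to \theta_1') \sqcup (\theta_2 \to \theta_2') \simeq (\theta_1 \& \theta_2) \to (\theta_1' \sqcup \theta_2')$ .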
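
These equivalences can be read directly as a recursive procedure. The Python sketch below follows them case by case; the tuple encoding of types and the names PARENT, prim_lub, meet, and lub are hypothetical. Results of the form $\iota\colon \mathbf{ns}$ or $\theta \to \mathbf{ns}$ are collapsed to ns, using the equivalences stated earlier in this section.

```python
# Types: primitive names and "ns" are strings; ("->", a, b) is a -> b;
# ("field", name, t) is name: t; ("&", a, b) is a & b.
PARENT = {"int": "real", "real": "value", "bool": "value", "char": "value",
          "value": None, "compl": "comm", "comm": None}

def ancestors(p):
    """p together with every primitive type above it."""
    chain = []
    while p is not None:
        chain.append(p)
        p = PARENT[p]
    return chain

def prim_lub(p1, p2):
    """Least upper bound under the primitive order, or None if none exists."""
    above2 = ancestors(p2)
    for p in ancestors(p1):
        if p in above2:
            return p
    return None

def meet(a, b):
    """a & b, simplified using the fact that ns is a unit for &."""
    if a == "ns":
        return b
    if b == "ns":
        return a
    return ("&", a, b)

def lub(a, b):
    if a == "ns" or b == "ns":
        return "ns"
    if isinstance(b, tuple) and b[0] == "&":
        return meet(lub(a, b[1]), lub(a, b[2]))
    if isinstance(a, tuple) and a[0] == "&":
        return meet(lub(a[1], b), lub(a[2], b))
    if isinstance(a, str) and isinstance(b, str):
        return prim_lub(a, b) or "ns"
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        if a[0] == "field" and a[1] == b[1]:
            inner = lub(a[2], b[2])
            return "ns" if inner == "ns" else ("field", a[1], inner)
        if a[0] == "->":
            result = lub(a[2], b[2])
            return "ns" if result == "ns" else ("->", meet(a[1], b[1]), result)
    return "ns"

intvar = ("&", "int", ("->", "int", "comm"))
realvar = ("&", "real", ("->", "real", "comm"))
upper = lub(intvar, realvar)
```

Here upper comes out as real & ((int & real) $\to$ comm). Since int & real is equivalent to int, this is equivalent to real & intacc, the (int, real) var type of Section 2.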

The  types  int,  real,  bool,  char,  value,  comm,  compl,  and  ns  are  actually  predefined 
type  identifiers  (whose  meaning  can  be  redefined  by  the  lettype  definition  to  be  discussed 
later, but which take on standard meanings outside of such redefinitions). Additional
predefined type identifiers are provided to abbreviate various commonly occurring nonprimitive
types. As discussed in the previous section, when $\delta$ is any of the character sequences int,
real, bool, or char that denote data types,

    $\delta$acc $\stackrel{\mathrm{def}}{=}$ $\delta \to$ comm

(e.g. intacc $\stackrel{\mathrm{def}}{=}$ int $\to$ comm), and

    $\delta$var $\stackrel{\mathrm{def}}{=}$ $\delta$ & $\delta$acc .

There are also abbreviations for commonly occurring types of sequences. In general, a
sequence s of element type $\theta$ and length n is an entity of type (int $\to \theta$) & len: int such
that the value of s.len is n and the application s i is well-defined for all integers i such that
$0 \le i < n$. (Of course, the proviso on definedness is not implied by the type of the sequence.)
The following type identifiers are predefined to abbreviate specific types of sequences:

    $\delta$seq $\stackrel{\mathrm{def}}{=}$ (int $\to \delta$) & len: int

    $\delta$accseq $\stackrel{\mathrm{def}}{=}$ (int $\to \delta$acc) & len: int

    $\delta$varseq $\stackrel{\mathrm{def}}{=}$ (int $\to \delta$var) & len: int
    commseq $\stackrel{\mathrm{def}}{=}$ (int $\to$ comm) & len: int

    complseq $\stackrel{\mathrm{def}}{=}$ (int $\to$ compl) & len: int .

For instance, a $\delta$varseq, in Algol terminology, is a one-dimensional $\delta$ array with a lower
bound of zero and an upper bound one less than its length. A $\delta$seq is a similar entity whose
elements can be evaluated but not assigned to, and a $\delta$accseq is a similar entity whose
elements can be assigned to but not evaluated. For example, charseq is the type of string
constants.
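
A sequence type is thus a procedure from indices intersected with a len field. The following Python sketch models that reading with a callable object; the class Seq and the helper char_seq are hypothetical, and raising IndexError outside the range is our own choice, since the type itself says nothing about definedness.

```python
class Seq:
    """An entity usable both as a procedure on indices and via its `len` field."""
    def __init__(self, length, element):
        self.len = length          # the field named len
        self._element = element    # the int -> theta component

    def __call__(self, i):
        # The application s i is only guaranteed to be defined for 0 <= i < len.
        if not 0 <= i < self.len:
            raise IndexError(i)
        return self._element(i)

def char_seq(text):
    """A charseq built from a string constant."""
    return Seq(len(text), lambda i: text[i])

abc = char_seq("abc")
squares = Seq(4, lambda i: i * i)   # an intseq whose elements are computed on demand
collected = [squares(i) for i in range(squares.len)]
```

Because the elements are produced by a procedure, a sequence need not be stored at all, as the squares example suggests.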


4.  The  Semantics  of  Types 


To  describe  the  meaning  of  types,  we  will  employ  some  basic  concepts  from  category  theory. 
The  main  reason  for  doing  so  is  that,  by  formulating  succinct  definitions  in  terms  of  a 
mathematical  theory  of  great  generality,  we  gain  an  assurance  that  our  language  will  be 
uniform  and  general. 

A  second  reason  is  that  the  abstract  concept  of  a  category  establishes  a  bridge  between 
intuitive  and  rigorous  semantics.  Intuitively,  we  think  of  a  type  as  standing  for  a  set,  and 
an  implicit  conversion  as  a  function  from  one  such  set  to  another.  But  since  our  language 
permits  nonterminating  programs,  types  must  denote  domains  (i.e.  complete  partial  orders 
with  a  least  element)  and  implicit  conversions  must  be  continuous  functions.  Moreover,  a 
further  level  of  complication  arises  when  one  develops  a  semantics  that  embodies  the  block 
structure  of  Algol-like  languages;  then  types  denote  functors  and  implicit  conversions  are 
natural  transformations  between  such  functors  [5,  16,  17]. 

However,  the  choice  between  these  three  different  views  is  simply  a  choice  between  three 
different  “semantic”  categories: 

•  SET, in which the objects are sets, and the set of morphisms $S \to S'$ is the set of
functions from $S$ to $S'$.

•  DOM, in which the objects are domains, and $D \to D'$ is the set of continuous
functions from $D$ to $D'$.

•  PDOM$^{\Sigma}$, in which the objects are functors from a category $\Sigma$ of “store shapes” to
the category PDOM of predomains (complete partial orders, possibly without least
elements) and continuous functions, and $F \to F'$ is the set of natural transformations
from $F$ to $F'$.


Therefore,  if  we  formulate  the  semantics  of  types  in  terms  of  an  arbitrary  category,  assuming 
only  properties  that  are  possessed  by  all  three  of  the  above  categories  (i.e.  being  Cartesian 
closed  and  possessing  certain  limits),  then  we  can  think  about  the  semantics  in  the  intuitive 
setting  of  sets  and  functions,  yet  be  confident  that  our  semantics  makes  sense  in  a  more 
rigorous  setting. 

Thus  we  will  define  types  in  terms  of  an  unspecified  semantic  category,  while  giving 
explanatory  remarks  and  examples  in  terms  of  the  particular  category  SET  (or  occasionally 
DOM). 

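For each type $\theta$, we write $[\![\theta]\!]$ for the object (e.g. set) denoted by $\theta$. Whenever $\theta \le \theta'$,
we write $[\![\theta \le \theta']\!]$ for the implicit conversion morphism (e.g. function) from $[\![\theta]\!]$ to $[\![\theta']\!]$. Two
requirements are imposed on these implicit conversion morphisms:

•  For all types $\theta$, the conversion from $[\![\theta]\!]$ to $[\![\theta]\!]$ must be an identity:

    $[\![\theta \le \theta]\!] = I_{[\![\theta]\!]}$ .

•  Whenever $\theta \le \theta'$ and $\theta' \le \theta''$, the composition of $[\![\theta \le \theta']\!]$ with $[\![\theta' \le \theta'']\!]$ must equal
$[\![\theta \le \theta'']\!]$, i.e. the diagram

[Diagram: triangle with edges $[\![\theta \le \theta']\!]$, $[\![\theta' \le \theta'']\!]$, and $[\![\theta \le \theta'']\!]$.]

must commute.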


These requirements coincide with a basic concept of category theory: $[\![-]\!]$ must be a functor
from  the  preordered  set  of  types  (viewed  as  a  category)  to  the  semantic  category. 

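The above requirements determine the semantics of equivalence. When $\theta \simeq \theta'$, the
diagrams

[Diagram: two triangles, one showing that $[\![\theta \le \theta']\!]$ followed by $[\![\theta' \le \theta]\!]$ is the identity on
$[\![\theta]\!]$, the other showing that the composition in the opposite order is the identity on $[\![\theta']\!]$.]

both commute, so that $[\![\theta]\!]$ and $[\![\theta']\!]$ are isomorphic, which we denote by $[\![\theta]\!] \approx [\![\theta']\!]$. (Note,
however, that nonequivalent types may also denote isomorphic objects.)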

Next,  we  define  (up  to  isomorphism)  the  meaning  of  each  type  constructor: 
•  Procedures: To define $\to$, we require the semantic category to be Cartesian closed,
and define $[\![\theta \to \theta']\!]$ to be $[\![\theta]\!] \Rightarrow [\![\theta']\!]$, where $\Rightarrow$ denotes the exponentiation
operation in the semantic category. In SET (DOM), $[\![\theta]\!] \Rightarrow [\![\theta']\!]$ is the set (domain) of all
(continuous) functions from $[\![\theta]\!]$ to $[\![\theta']\!]$.

•  Object Constructors: We define $[\![\iota\colon \theta]\!]$ to be an object that is isomorphic to $[\![\theta]\!]$.

•  Nonsense: We define $[\![\mathbf{ns}]\!]$ to be a terminal object T, i.e. an object such that, for
any object s, there is exactly one morphism from s to T. In SET or DOM a terminal
object is a set containing one element. (Thus even nonsense phrases have a meaning,
but they all have the same meaning.)

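•  Intersection: Because of its novelty, we describe the meaning of intersection in more
detail than the other type constructors. Basically, the meaning of $\theta_1 \& \theta_2$ is determined
by the meanings of $\theta_1$, $\theta_2$, and their least upper bound $\theta_1 \sqcup \theta_2$. From $\theta_1 \& \theta_2$, we
can convert to $\theta_1$ and from there to $\theta_1 \sqcup \theta_2$, or we can convert to $\theta_2$ and from there
to $\theta_1 \sqcup \theta_2$; clearly the two compositions of conversions should be equal. Moreover,
whenever $\theta \le \theta_1 \& \theta_2$, the composite conversion from $\theta$ to $\theta_1 \& \theta_2$ to $\theta_1$ should equal
the direct conversion from $\theta$ to $\theta_1$, and similarly for $\theta_2$. In other words, in the diagram

[Diagram: an inner diamond from $[\![\theta_1 \& \theta_2]\!]$ through $[\![\theta_1]\!]$ and $[\![\theta_2]\!]$ to $[\![\theta_1 \sqcup \theta_2]\!]$,
together with conversions from $[\![\theta]\!]$ to $[\![\theta_1 \& \theta_2]\!]$, $[\![\theta_1]\!]$, and $[\![\theta_2]\!]$.]

the inner diamond must commute and, for all $\theta$ such that $\theta \le \theta_1 \& \theta_2$, the two triangles
must commute.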

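However, these requirements are not sufficient to determine $[\![\theta_1 \& \theta_2]\!]$. To strengthen them, we replace $[\![\theta]\!]$ by an arbitrary object $s$ and $[\![\theta \le \theta_1]\!]$ and $[\![\theta \le \theta_2]\!]$ by any functions $f_1$ and $f_2$ that make the outer diamond commute, and we require the "mediating morphism" from $s$ to $[\![\theta_1 \& \theta_2]\!]$ to be unique. Specifically, we define $[\![\theta_1 \& \theta_2]\!]$ by requiring that, in the diagram

[Diagram: the same inner diamond from $[\![\theta_1 \& \theta_2]\!]$ through $[\![\theta_1]\!]$ and $[\![\theta_2]\!]$ to $[\![\theta_1 \sqcup \theta_2]\!]$, now with an arbitrary object $s$ mapping to $[\![\theta_1]\!]$ by $f_1$, to $[\![\theta_2]\!]$ by $f_2$, and to $[\![\theta_1 \& \theta_2]\!]$ by the mediating morphism.]

the inner diamond must commute and, for all objects $s$ and morphisms $f_1$ and $f_2$ that make the outer diamond commute, there must be a unique morphism from $s$ to $[\![\theta_1 \& \theta_2]\!]$ that makes the two triangles commute.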

Clearly,  this  strengthening  is  something  of  a  leap  of  faith.  Thus  it  is  reassuring  that 
our  definition  coincides  with  a  standard  concept  of  category  theory:  we  have  defined 
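$[\![\theta_1 \& \theta_2]\!]$ to be the pullback of $[\![\theta_1]\!]$, $[\![\theta_2]\!]$, and $[\![\theta_1 \sqcup \theta_2]\!]$ (which is unique up to isomorphism).

For sets or domains, the pullback is

$$[\![\theta_1 \& \theta_2]\!] \approx \{\, \langle x_1, x_2 \rangle \mid x_1 \in [\![\theta_1]\!] \text{ and } x_2 \in [\![\theta_2]\!] \text{ and } [\![\theta_1 \le \theta_1 \sqcup \theta_2]\!]\,x_1 = [\![\theta_2 \le \theta_1 \sqcup \theta_2]\!]\,x_2 \,\}.$$

(For domains, one must require all implicit conversion functions to be strict.) In other words, a meaning of type $\theta_1 \& \theta_2$ is a meaning of type $\theta_1$ paired with a meaning of type $\theta_2$, subject to the constraint that these meanings must convert to the same meaning of type $\theta_1 \sqcup \theta_2$.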
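The pullback formula for sets can be tried out on small finite sets. The following Python sketch is only an illustration: the function name `pullback`, the sample sets, and the use of Python `float` to stand in for the conversion from int to real are all assumptions made here, not part of Forsythe.

```python
def pullback(set1, set2, to_lub_1, to_lub_2):
    """Pairs (x1, x2) whose conversions into the least upper bound agree."""
    return {(x1, x2) for x1 in set1 for x2 in set2
            if to_lub_1(x1) == to_lub_2(x2)}

# Case theta1 <= theta2 (here int <= real): the least upper bound is real,
# so each integer is paired with exactly one real, and the result is
# isomorphic to the set of integers.
ints = {0, 1, 2}
reals = {0.0, 0.5, 1.0, 2.0}
int_and_real = pullback(ints, reals, float, lambda r: r)

# Case where the least upper bound is ns: both conversions land in a
# one-element set, so the constraint always holds and we get a product.
product = pullback(ints, {"a", "b"}, lambda _: (), lambda _: ())

# Case of identity injections into a common superset: the result
# corresponds to ordinary set intersection.
shared = pullback({1, 2, 3}, {2, 3, 4}, lambda v: v, lambda v: v)
```

In each case the constraint is the same equation; only the conversion functions passed in differ.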

The  following  are  special  cases  of  the  definition  of  intersection.  Although  we  describe  these 
cases  in  terms  of  SET  and  DOM,  basically  similar  results  hold  for  any  semantic  category 
that  is  Cartesian  closed  and  possesses  the  pullbacks  necessary  to  define  intersection. 
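• If $\theta_1 \sqcup \theta_2 = \mathbf{ns}$ then the constraint

$$[\![\theta_1 \le \theta_1 \sqcup \theta_2]\!]\,x_1 = [\![\theta_2 \le \theta_1 \sqcup \theta_2]\!]\,x_2$$

always holds, since both sides of the equation belong to the one-element set $[\![\mathbf{ns}]\!]$. Thus

$$[\![\theta_1 \& \theta_2]\!] \approx [\![\theta_1]\!] \times [\![\theta_2]\!] .$$

For example,

$$[\![\mathbf{intvar}]\!] = [\![\mathbf{int} \& (\mathbf{int} \to \mathbf{comm})]\!] \approx [\![\mathbf{int}]\!] \times [\![\mathbf{int} \to \mathbf{comm}]\!]$$

$$[\![\iota\colon\theta_1 \& (\theta_2 \to \theta_3)]\!] \approx [\![\iota\colon\theta_1]\!] \times [\![\theta_2 \to \theta_3]\!] \approx [\![\theta_1]\!] \times [\![\theta_2 \to \theta_3]\!]$$

and, when $\iota_1 \ne \iota_2$,

$$[\![\iota_1\colon\theta_1 \& \iota_2\colon\theta_2]\!] \approx [\![\iota_1\colon\theta_1]\!] \times [\![\iota_2\colon\theta_2]\!] \approx [\![\theta_1]\!] \times [\![\theta_2]\!] .$$

• If $\theta_1 \le \theta_2$, so that $\theta_1 \sqcup \theta_2 = \theta_2$, then

$$[\![\theta_1 \& \theta_2]\!] \approx \{\, \langle x_1, x_2 \rangle \mid x_1 \in [\![\theta_1]\!] \text{ and } x_2 \in [\![\theta_2]\!] \text{ and } [\![\theta_1 \le \theta_2]\!]\,x_1 = x_2 \,\} \approx [\![\theta_1]\!] .$$

For example,

$$[\![\mathbf{int} \& \mathbf{real}]\!] \approx [\![\mathbf{int}]\!]$$

$$[\![\mathbf{compl} \& \mathbf{comm}]\!] \approx [\![\mathbf{compl}]\!] .$$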

• If $[\![\theta_1]\!]$ and $[\![\theta_2]\!]$ are subsets of $[\![\theta_1 \sqcup \theta_2]\!]$, and $[\![\theta_1 \le \theta_1 \sqcup \theta_2]\!]$ and $[\![\theta_2 \le \theta_1 \sqcup \theta_2]\!]$ are identity injections, then

$$[\![\theta_1 \& \theta_2]\!] \approx \{\, \langle x_1, x_2 \rangle \mid x_1 \in [\![\theta_1]\!] \text{ and } x_2 \in [\![\theta_2]\!] \text{ and } x_1 = x_2 \,\} \approx [\![\theta_1]\!] \cap [\![\theta_2]\!] .$$

In this special case, the intersection of types does correspond to the intersection of sets.

An example of this case arises when $\theta_1$ and $\theta_2$ are data types with no implicit conversion between them, such as int and char. In this case, their least upper bound is value, which stands for the union of the data types, so that the implicit conversions into value are identity injections. Thus,

$$[\![\mathbf{int} \& \mathbf{char}]\!] \approx [\![\mathbf{int}]\!] \cap [\![\mathbf{char}]\!] .$$

This is the purpose of introducing the type value. Had we not done so, we would have $\mathbf{int} \sqcup \mathbf{char} = \mathbf{ns}$, which would give $[\![\mathbf{int} \& \mathbf{char}]\!] \approx [\![\mathbf{int}]\!] \times [\![\mathbf{char}]\!]$.

In Forsythe, the sets denoted by the data types real, bool, and char are disjoint (and the set denoted by int is a subset of that denoted by real), so that intersections such as $\mathbf{int} \& \mathbf{char}$ denote the empty set. However, this is a detail of the language design, while the preceding argument is more general.
The above arguments are based on the intuition that data types denote sets of values. In fact, however, data types such as int and char, when they are used as phrase types, denote sets of meanings appropriate to kinds of expressions. Specifically, in a domain-theoretic model,

$$[\![\mathbf{int}]\!] = S \to Z_\perp \qquad [\![\mathbf{char}]\!] = S \to C_\perp \qquad [\![\mathbf{value}]\!] = S \to V_\perp ,$$

where $S$ is the set of states; $Z_\perp$, $C_\perp$, and $V_\perp$ are sets of integers, characters, and values, made into flat domains by adding a least element; and $Z$ and $C$ are subsets of $V$. Even in this richer setting, however, it is still true that the conversion functions from $[\![\mathbf{int}]\!]$ and $[\![\mathbf{char}]\!]$ into $[\![\mathbf{value}]\!]$ are identity injections.


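• Finally, we consider the intersection of procedural types. First, we must define the implicit conversions between such types. If $\theta_1' \le \theta_1$ and $\theta_2 \le \theta_2'$ then the conversion of $f \in [\![\theta_1 \to \theta_2]\!]$ to $[\![\theta_1' \to \theta_2']\!]$ is obtained by composing $f$ with appropriate conversions of its arguments and results:

[Diagram: a square in which the converted procedure $[\![\theta_1 \to \theta_2 \le \theta_1' \to \theta_2']\!]\,f$ from $[\![\theta_1']\!]$ to $[\![\theta_2']\!]$ equals the path through $[\![\theta_1' \le \theta_1]\!]$, then $f$ from $[\![\theta_1]\!]$ to $[\![\theta_2]\!]$, then $[\![\theta_2 \le \theta_2']\!]$.]

or as an equation,

$$[\![\theta_1 \to \theta_2 \le \theta_1' \to \theta_2']\!]\,f = [\![\theta_1' \le \theta_1]\!] \,;\, f \,;\, [\![\theta_2 \le \theta_2']\!]$$

where $;$ denotes composition in diagrammatic order.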

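From the definition of intersection, by substituting the equation for the least upper bound of two procedural types, and using the above equation for the implicit conversion of procedural types, we obtain

$$\begin{aligned}
[\![(\theta_1 \to \theta_1') \& (\theta_2 \to \theta_2')]\!] \approx{} & \{\, \langle f_1, f_2 \rangle \mid f_1 \in [\![\theta_1 \to \theta_1']\!] \text{ and } f_2 \in [\![\theta_2 \to \theta_2']\!] \\
& \quad \text{and } [\![\theta_1 \to \theta_1' \le (\theta_1 \& \theta_2) \to (\theta_1' \sqcup \theta_2')]\!]\,f_1 = [\![\theta_2 \to \theta_2' \le (\theta_1 \& \theta_2) \to (\theta_1' \sqcup \theta_2')]\!]\,f_2 \,\} \\
={} & \{\, \langle f_1, f_2 \rangle \mid f_1 \in [\![\theta_1 \to \theta_1']\!] \text{ and } f_2 \in [\![\theta_2 \to \theta_2']\!] \\
& \quad \text{and } [\![\theta_1 \& \theta_2 \le \theta_1]\!] \,;\, f_1 \,;\, [\![\theta_1' \le \theta_1' \sqcup \theta_2']\!] = [\![\theta_1 \& \theta_2 \le \theta_2]\!] \,;\, f_2 \,;\, [\![\theta_2' \le \theta_1' \sqcup \theta_2']\!] \,\} .
\end{aligned}$$

Here the constraint on $f_1$ and $f_2$ is the commutativity of a hexagon:

[Diagram: a hexagon from $[\![\theta_1 \& \theta_2]\!]$ to $[\![\theta_1' \sqcup \theta_2']\!]$. The upper path is $[\![\theta_1 \& \theta_2 \le \theta_1]\!]$, then $f_1$ from $[\![\theta_1]\!]$ to $[\![\theta_1']\!]$, then $[\![\theta_1' \le \theta_1' \sqcup \theta_2']\!]$; the lower path is $[\![\theta_1 \& \theta_2 \le \theta_2]\!]$, then $f_2$ from $[\![\theta_2]\!]$ to $[\![\theta_2']\!]$, then $[\![\theta_2' \le \theta_1' \sqcup \theta_2']\!]$.]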
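This constraint implies that the "versions" of a procedure whose type is an intersection must respect implicit conversions, which is what distinguishes such a procedure from the usual notion of a "generic" procedure. For example, since $\mathbf{int} \& \mathbf{real} = \mathbf{int}$ and $\mathbf{int} \sqcup \mathbf{real} = \mathbf{real}$ (taking $=$ rather than $\simeq$ here simplifies the argument),

$$\begin{aligned}
[\![(\mathbf{int} \to \mathbf{int}) \& (\mathbf{real} \to \mathbf{real})]\!] \approx{} & \{\, \langle f_1, f_2 \rangle \mid f_1 \in [\![\mathbf{int} \to \mathbf{int}]\!] \text{ and } f_2 \in [\![\mathbf{real} \to \mathbf{real}]\!] \\
& \quad \text{and } f_1 \,;\, [\![\mathbf{int} \le \mathbf{real}]\!] = [\![\mathbf{int} \le \mathbf{real}]\!] \,;\, f_2 \,\} .
\end{aligned}$$

Here the hexagon collapses into a rectangle, so that $f_1$ and $f_2$ must satisfy

[Diagram: a rectangle with $f_1$ from $[\![\mathbf{int}]\!]$ to $[\![\mathbf{int}]\!]$ along the top, $f_2$ from $[\![\mathbf{real}]\!]$ to $[\![\mathbf{real}]\!]$ along the bottom, and $[\![\mathbf{int} \le \mathbf{real}]\!]$ down each side.]

On the other hand,

$$[\![(\mathbf{int} \to \mathbf{int}) \& (\mathbf{char} \to \mathbf{char})]\!] \approx [\![\mathbf{int} \to \mathbf{int}]\!] \times [\![\mathbf{char} \to \mathbf{char}]\!] ,$$

since in this case the hexagonal constraint on $f_1$ and $f_2$ is vacuously true because $[\![\mathbf{int} \& \mathbf{char}]\!]$ is the empty set.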
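The conversion of procedural types and the rectangle constraint for $(\mathbf{int} \to \mathbf{int}) \& (\mathbf{real} \to \mathbf{real})$ can be illustrated with ordinary functions. This is a rough Python sketch, not Forsythe semantics: the helper names are hypothetical, Python `float` stands in for $[\![\mathbf{int} \le \mathbf{real}]\!]$, and the constraint is only tested on a finite sample rather than proved.

```python
import math

def convert_procedure(f, convert_arg, convert_result):
    """Diagrammatic composition: convert_arg ; f ; convert_result."""
    return lambda x: convert_result(f(convert_arg(x)))

# real -> int is a subtype of int -> real, since int <= real on the
# argument side (contravariant) and int <= real on the result side.
floor_as_int_to_real = convert_procedure(math.floor, float, float)

def respects_conversions(f1, f2, samples):
    """Check f1 ; [int <= real] == [int <= real] ; f2 on sample integers."""
    return all(float(f1(n)) == f2(float(n)) for n in samples)

# Two "versions" that agree up to conversion: a legitimate meaning of
# type (int -> int) & (real -> real).
coherent = respects_conversions(lambda n: n * n + 2,
                                lambda x: x * x + 2,
                                range(-5, 6))

# An ad hoc overloaded pair that does not respect the conversion.
incoherent = respects_conversions(lambda n: n + 1,
                                  lambda x: x * 2.0,
                                  range(-5, 6))
```

The second pair is the kind of "generic" procedure that the constraint rules out: its integer and real versions disagree on integer arguments.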


5.  Phrases  and  their  Typings 


We now introduce the phrases of Forsythe and give rules for determining their types. Specifically, we will give inference rules for formulas called typings.

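A type assignment is a function from a finite set of identifiers to types. If $\pi$ is a type assignment, then $[\pi \mid \iota\colon\theta]$ denotes the type assignment whose domain is $\mathrm{dom}\,\pi \cup \{\iota\}$, such that $[\pi \mid \iota\colon\theta]\,\iota = \theta$ and $[\pi \mid \iota\colon\theta]\,\iota' = \pi\,\iota'$ when $\iota' \ne \iota$. We write $[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]$ to abbreviate $[\cdots[\pi \mid \iota_1\colon\theta_1] \cdots \mid \iota_n\colon\theta_n]$.

If $\pi$ is a type assignment, $p$ is a phrase, and $\theta$ is a type, then the formula $\pi \vdash p : \theta$, called a typing, asserts that the phrase $p$ has the type $\theta$ when its free identifiers are assigned types by $\pi$.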

An inference rule consists of zero or more typings called premisses followed (after a horizontal line) by one or more typings called conclusions. The rule may contain metavariables denoting type assignments, phrases, identifiers, or types; an instance of the rule is obtained by replacing these metavariables by particular type assignments, phrases, identifiers, or types. (Some rules will have restrictions on the permissible replacements.) The meaning of a rule is that, for any instance, if all the premisses are valid typings then all of the conclusions are valid typings.

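First, we have rules describing the behavior of subtypes, the nonsense type, and intersections of types:

• Subtypes (often called subsumption)

$$\frac{\pi \vdash p : \theta}{\pi \vdash p : \theta'} \quad \text{when } \theta \le \theta'$$

• Nonsense

$$\frac{}{\pi \vdash p : \mathbf{ns}}$$

• Intersection

$$\frac{\pi \vdash p : \theta_1 \qquad \pi \vdash p : \theta_2}{\pi \vdash p : \theta_1 \& \theta_2}$$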

Then  there  are  rules  for  typing  identifiers,  applications  (procedure  calls),  and  conditional 
phrases: 

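• Identifiers

$$\frac{}{\pi \vdash \iota : \pi(\iota)} \quad \text{when } \iota \in \mathrm{dom}\,\pi$$

• Applications

$$\frac{\pi \vdash p_1 : \theta \to \theta' \qquad \pi \vdash p_2 : \theta}{\pi \vdash p_1\,p_2 : \theta'}$$

• Conditionals

$$\frac{\pi \vdash p_1 : \mathbf{bool} \qquad \pi \vdash p_2 : \theta \qquad \pi \vdash p_3 : \theta}{\pi \vdash \mathbf{if}\ p_1\ \mathbf{then}\ p_2\ \mathbf{else}\ p_3 : \theta}$$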

Notice  that  the  conditional  construction  is  applicable  to  arbitrary  types. 

Next  we  consider  abstractions  (sometimes  called  lambda  expressions),  which  are  used 
to  denote  procedures.  Here  there  are  two  cases,  depending  upon  whether  the  type  of  the 
argument  to  the  procedure  is  indicated  explicitly.  In  the  explicit  case  we  have: 

• Abstractions (with explicit typing)

$$\frac{[\pi \mid \iota\colon\theta_i] \vdash p : \theta'}{\pi \vdash (\lambda\iota\colon \theta_1 \mid \cdots \mid \theta_n.\ p) : \theta_i \to \theta'}$$

In this rule, notice that $\theta_i$ must be one of a list of types appearing explicitly in the abstraction (separated by the alternative operator $\mid$). For example, under any type assignment, the abstraction

$$\lambda x\colon \mathbf{int}.\ x \times x + 2$$

has type $\mathbf{int} \to \mathbf{int}$, while the abstraction

$$\lambda x\colon \mathbf{int} \mid \mathbf{real}.\ x \times x + 2$$

has both type $\mathbf{int} \to \mathbf{int}$ and type $\mathbf{real} \to \mathbf{real}$, so that, by the rule for intersection, it also has type $(\mathbf{int} \to \mathbf{int}) \& (\mathbf{real} \to \mathbf{real})$. (Note the role of the colon, which is always used in Forsythe to specify the types of identifiers or phrases.)
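The way an explicitly typed abstraction with alternatives acquires several types can be mimicked by checking the body once per alternative. The Python sketch below is a deliberately tiny model and makes several assumptions: the only types are int and real, the only phrases are identifiers, natural-number constants, and binary arithmetic, and the class and function names are invented for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Nat:
    value: int

@dataclass(frozen=True)
class BinOp:
    op: str          # "+" or "*"; both are typed the same way here
    left: object
    right: object

def type_of(env, phrase):
    """Least type (int or real) of an arithmetic phrase under env."""
    if isinstance(phrase, Nat):
        return "int"
    if isinstance(phrase, Var):
        return env[phrase.name]
    if isinstance(phrase, BinOp):
        left = type_of(env, phrase.left)
        right = type_of(env, phrase.right)
        # int <= real: an int operand may be used as a real by subsumption.
        return "int" if left == right == "int" else "real"
    raise TypeError("unknown phrase")

def abstraction_types(param, alternatives, body, env=None):
    """One typing theta_i -> theta' per alternative listed in the abstraction."""
    base = dict(env or {})
    return [(theta, type_of({**base, param: theta}, body))
            for theta in alternatives]

# The abstraction  lambda x: int | real. x * x + 2
body = BinOp("+", BinOp("*", Var("x"), Var("x")), Nat(2))
typings = abstraction_types("x", ["int", "real"], body)
```

Each pair in `typings` corresponds to one instance of the abstraction rule; the intersection rule then combines them into a single intersection type.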

In  contrast,  in  the  case  where  the  argument  is  implicitly  typed,  we  have: 

• Abstractions (with implicit typing)

$$\frac{[\pi \mid \iota\colon\theta] \vdash p : \theta'}{\pi \vdash (\lambda\iota.\ p) : \theta \to \theta'}$$

Here the abstraction provides no explicit constraints on the type $\theta$. For example, for the abstraction $\lambda x.\ x \times x + 2$, one can use this rule to infer either of the types $\mathbf{int} \to \mathbf{int}$ or $\mathbf{real} \to \mathbf{real}$. More vividly, for the abstraction $\lambda x.\ x$, one can infer any type of the form $\theta \to \theta$.

At  this  point,  one  might  ask  why  one  would  ever  use  explicit  typing.  Sensible  answers 
are  to  make  the  program  more  readable,  or  to  insure  that  a  procedure  has  the  typing  one 
expects,  rather  than  just  some  typing  that  makes  the  overall  program  type-correct.  But  a 
more  stringent  answer  is  that  it  has  been  proven  that  there  is  no  algorithm  that  can  typecheck 
an  arbitrary  implicitly  typed  program  in  the  intersection  type  discipline  [13,  14,  15].  Thus 
the  Forsythe  implementation  requires  some  explicit  type  information  to  be  provided.  The 
exact  nature  of  this  requirement  is  described  in  Appendix  C. 

Next  there  are  constructions  for  denoting  objects  and  selecting  their  fields: 

• Object Construction

$$\frac{\pi \vdash p : \theta}{\pi \vdash (\iota = p) : (\iota\colon\theta)}$$

• Field Selection

$$\frac{\pi \vdash p : (\iota\colon\theta)}{\pi \vdash p.\iota : \theta}$$

The  first  of  these  forms  denotes  objects  with  only  a  single  field;  objects  with  several  fields 
can  be  denoted  by  the  merge  construction,  which  will  be  described  later.  Note  the  role  of 
the  connective  =,  which  is  always  used  to  connect  identifiers  with  their  meanings. 

Then  comes  a  long  list  of  rules  describing  various  types  of  constants  and  expressions: 
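• Constants

$$\frac{}{\pi \vdash \langle\text{nat const}\rangle : \mathbf{int}} \qquad \frac{}{\pi \vdash \langle\text{real const}\rangle : \mathbf{real}} \qquad \frac{}{\pi \vdash \langle\text{char const}\rangle : \mathbf{char}} \qquad \frac{}{\pi \vdash \langle\text{string const}\rangle : \mathbf{charseq}}$$

• Arithmetic Expressions

$$\frac{\pi \vdash p : \mathbf{int}}{\begin{array}{l} \pi \vdash +p : \mathbf{int} \\ \pi \vdash -p : \mathbf{int} \end{array}} \qquad \frac{\pi \vdash p : \mathbf{real}}{\begin{array}{l} \pi \vdash +p : \mathbf{real} \\ \pi \vdash -p : \mathbf{real} \end{array}}$$

$$\frac{\pi \vdash p_1 : \mathbf{int} \qquad \pi \vdash p_2 : \mathbf{int}}{\begin{array}{l} \pi \vdash p_1 + p_2 : \mathbf{int} \\ \pi \vdash p_1 - p_2 : \mathbf{int} \\ \pi \vdash p_1 \times p_2 : \mathbf{int} \\ \pi \vdash p_1 \div p_2 : \mathbf{int} \\ \pi \vdash p_1\ \mathbf{rem}\ p_2 : \mathbf{int} \\ \pi \vdash p_1 \mathbin{**} p_2 : \mathbf{int} \end{array}} \qquad \frac{\pi \vdash p_1 : \mathbf{real} \qquad \pi \vdash p_2 : \mathbf{real}}{\begin{array}{l} \pi \vdash p_1 + p_2 : \mathbf{real} \\ \pi \vdash p_1 - p_2 : \mathbf{real} \\ \pi \vdash p_1 \times p_2 : \mathbf{real} \\ \pi \vdash p_1 / p_2 : \mathbf{real} \end{array}} \qquad \frac{\pi \vdash p_1 : \mathbf{real} \qquad \pi \vdash p_2 : \mathbf{int}}{\pi \vdash p_1 \uparrow p_2 : \mathbf{real}}$$

• Relations

$$\frac{\pi \vdash p_1 : \mathbf{real} \qquad \pi \vdash p_2 : \mathbf{real}}{\begin{array}{l} \pi \vdash p_1 = p_2 : \mathbf{bool} \\ \pi \vdash p_1 \ne p_2 : \mathbf{bool} \\ \pi \vdash p_1 < p_2 : \mathbf{bool} \\ \pi \vdash p_1 \le p_2 : \mathbf{bool} \\ \pi \vdash p_1 > p_2 : \mathbf{bool} \\ \pi \vdash p_1 \ge p_2 : \mathbf{bool} \end{array}} \qquad \frac{\pi \vdash p_1 : \mathbf{char} \qquad \pi \vdash p_2 : \mathbf{char}}{\begin{array}{l} \pi \vdash p_1 = p_2 : \mathbf{bool} \\ \pi \vdash p_1 \ne p_2 : \mathbf{bool} \\ \pi \vdash p_1 < p_2 : \mathbf{bool} \\ \pi \vdash p_1 \le p_2 : \mathbf{bool} \\ \pi \vdash p_1 > p_2 : \mathbf{bool} \\ \pi \vdash p_1 \ge p_2 : \mathbf{bool} \end{array}} \qquad \frac{\pi \vdash p_1 : \mathbf{bool} \qquad \pi \vdash p_2 : \mathbf{bool}}{\begin{array}{l} \pi \vdash p_1 = p_2 : \mathbf{bool} \\ \pi \vdash p_1 \ne p_2 : \mathbf{bool} \end{array}}$$

• Boolean Expressions

$$\frac{\pi \vdash p : \mathbf{bool}}{\pi \vdash {\sim}p : \mathbf{bool}} \qquad \frac{\pi \vdash p_1 : \mathbf{bool} \qquad \pi \vdash p_2 : \mathbf{bool}}{\begin{array}{l} \pi \vdash p_1 \land p_2 : \mathbf{bool} \\ \pi \vdash p_1 \lor p_2 : \mathbf{bool} \\ \pi \vdash p_1 \Rightarrow p_2 : \mathbf{bool} \\ \pi \vdash p_1 \Leftrightarrow p_2 : \mathbf{bool} \end{array}}$$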

Here the only real novelty is the provision of two operators for exponentiation: $\uparrow$ accepts a real and an integer, and yields a real, while $**$ accepts two integers, and yields an integer (giving an error stop if its second operand is negative). The boolean operators $\Rightarrow$ and $\Leftrightarrow$ denote implication and equivalence (if-and-only-if) respectively.
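The difference between the two exponentiation operators can be sketched in Python. The function names `real_power` and `int_power` are hypothetical, and a Python exception is used here merely as a stand-in for Forsythe's error stop.

```python
def real_power(base: float, exponent: int) -> float:
    """Model of the real-valued operator: real base, integer exponent."""
    return float(base) ** exponent

def int_power(base: int, exponent: int) -> int:
    """Model of **: two integers in, integer out, error on negative exponent."""
    if exponent < 0:
        # Stand-in for the error stop described in the text.
        raise RuntimeError("negative exponent in integer exponentiation")
    return base ** exponent

half = real_power(2.0, -1)      # a negative exponent is fine for reals
kilo = int_power(2, 10)         # stays within the integers
```

Keeping the two operators separate means the type of the result never depends on the sign of a run-time value.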

As  in  Algol,  the  semicolon  denotes  sequential  composition  of  commands.  But  now  it  can 
also  be  used  to  compose  a  command  with  a  completion,  giving  a  completion: 

• Sequential Composition

$$\frac{\pi \vdash p_1 : \mathbf{comm} \qquad \pi \vdash p_2 : \mathbf{comm}}{\pi \vdash p_1 \,;\, p_2 : \mathbf{comm}} \qquad \frac{\pi \vdash p_1 : \mathbf{comm} \qquad \pi \vdash p_2 : \mathbf{compl}}{\pi \vdash p_1 \,;\, p_2 : \mathbf{compl}}$$

Two iterative constructions are provided: the traditional while command, and a loop construction which iterates its operand ad infinitum (i.e. until the operand jumps out of the loop by executing a completion):

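• while Commands

$$\frac{\pi \vdash p_1 : \mathbf{bool} \qquad \pi \vdash p_2 : \mathbf{comm}}{\pi \vdash \mathbf{while}\ p_1\ \mathbf{do}\ p_2 : \mathbf{comm}}$$

• loop Completions

$$\frac{\pi \vdash p : \mathbf{comm}}{\pi \vdash \mathbf{loop}\ p : \mathbf{compl}}$$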

There  is  also  a  phrase  whose  execution  causes  an  error  stop,  with  a  message  obtained  by 
evaluating  a  character  sequence: 

• Error stops

$$\frac{\pi \vdash p : \mathbf{charseq}}{\pi \vdash \mathbf{error}\ p : \theta}$$

Notice that $\mathbf{error}\ p$ is a phrase that can have any type.

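In place of the procedure declarations of Algol, Forsythe provides the more general let-definition construct invented by Peter Landin. In the implicitly typed case:

• Nonrecursive Definitions (with implicit typing)

$$\frac{\begin{array}{c} \pi \vdash p_1 : \theta_1 \\ \vdots \\ \pi \vdash p_n : \theta_n \\ {[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]} \vdash p : \theta \end{array}}{\pi \vdash \mathbf{let}\ \iota_1 = p_1, \ldots, \iota_n = p_n\ \mathbf{in}\ p : \theta}$$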


For example, in place of the Algol block

$$\mathbf{begin\ procedure}\ p(x);\ \theta\ x;\ B_{\mathrm{proc}};\ B\ \mathbf{end} ,$$

one can write

$$\mathbf{let}\ p = \lambda x\colon\theta.\ B_{\mathrm{proc}}\ \mathbf{in}\ B .$$

Such definitions are not limited to procedures. One can write

$$\mathbf{let}\ x = 3\ \mathbf{in}\ B ,$$

which will have exactly the same meaning as the phrase obtained from $B$ by substituting 3 for $x$. Note, however, that this is not a variable declaration; $x$ has the type int (the type of 3) and cannot be assigned to within $B$. Moreover, if $y$ is an integer variable then

$$\mathbf{let}\ x = y\ \mathbf{in}\ B$$

has the same meaning as the phrase obtained from $B$ by substituting $y$ for $x$, i.e. $x$ is defined to be an alias of $y$.


Definitions can also be explicitly typed; indeed one can mix implicit and explicit typing in the same let-construction. To describe this situation, we adopt the convention that, when an inference rule contains the notation $\{\cdots\}^{?}$, it stands for two rules, obtained (i) by deleting the notation, and (ii) by replacing it by the contents of the braces. (When the notation occurs $n$ times, the rule stands for the $2^n$ rules obtained by taking all possible combinations.) Using this notation, we have a general rule that includes the previous one as a special case.

• Nonrecursive Definitions

$$\frac{\begin{array}{c} \pi \vdash p_1 : \theta_1 \\ \vdots \\ \pi \vdash p_n : \theta_n \\ {[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]} \vdash p : \theta \end{array}}{\pi \vdash \mathbf{let}\ \iota_1\{\colon\theta_1\}^{?} = p_1, \ldots, \iota_n\{\colon\theta_n\}^{?} = p_n\ \mathbf{in}\ p : \theta}$$
Here, the explicit occurrence of $\colon\theta_i$ in the definition constrains the type $\theta_i$ to be used in an application of the rule, while the absence of such an occurrence allows $\theta_i$ to be chosen arbitrarily. (Notice that the alternative operator $\mid$ cannot be used in let constructions, nor in the recursive definitions discussed below.)

For example, in place of the procedure definition displayed earlier, one can specify the type of the procedure in the definition (instead of specifying the type of the procedure argument in the abstraction):

$$\mathbf{let}\ p\colon \theta \to \mathbf{comm} = \lambda x.\ B_{\mathrm{proc}}\ \mathbf{in}\ B .$$

There is also an alternative form of the nonrecursive definition,

$$\mathbf{letinline}\ \iota_1\{\colon\theta_1\}^{?} = p_1, \ldots, \iota_n\{\colon\theta_n\}^{?} = p_n\ \mathbf{in}\ p ,$$

that has the same typing behavior and semantics as the let form, but causes the procedures or other entities being defined to be compiled into inline code.

Recursion  is  provided  in  two  ways.  On  the  one  hand,  there  is  a  fixed-point  operator  rec: 

• Fixed Points

$$\frac{\pi \vdash p : \theta \to \theta}{\pi \vdash \mathbf{rec}\colon\theta.\ p : \theta}$$

Notice  that  explicit  typing  is  required  here. 

On  the  other  hand,  there  is  a  form  of  recursive  definition: 

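• Recursive Definitions

$$\frac{\begin{array}{c} {[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]} \vdash p_1 : \theta_1 \\ \vdots \\ {[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]} \vdash p_n : \theta_n \\ {[\pi \mid \iota_1\colon\theta_1 \mid \cdots \mid \iota_n\colon\theta_n]} \vdash p : \theta \end{array}}{\pi \vdash \mathbf{letrec}\ \iota_1\colon\theta_1, \ldots, \iota_n\colon\theta_n\ \mathbf{where}\ \iota_1 = p_1, \ldots, \iota_n = p_n\ \mathbf{in}\ p : \theta}$$

In this form, the recursively defined identifiers and their types must be listed before the definitions themselves, so that the reader (and compiler) knows that these identifiers have been rebound, and what their types are, before reading any of the $p_i$.

To keep the above rule simple, we have assumed that the two lists in a recursive definition define the identifiers $\iota_1, \ldots, \iota_n$ in the same order. In fact, however, the order of the items in each of the lists is arbitrary. However, for both the recursive and nonrecursive definitions, $\iota_1, \ldots, \iota_n$ must be distinct identifiers.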

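Next we consider a construction for intersecting or "merging" meanings. Suppose $p_1$ has type $\theta_1$, $p_2$ has type $\theta_2$, and $\theta_1 \sqcup \theta_2 = \mathbf{ns}$, so that $[\![\theta_1 \& \theta_2]\!] \approx [\![\theta_1]\!] \times [\![\theta_2]\!]$. One might hope to write $p_1, p_2$ to denote a meaning of type $\theta_1 \& \theta_2$.

Unfortunately, this conflicts with the behavior of subtypes, since $p_1$ and $p_2$ might have types $\theta_1'$ and $\theta_2'$ such that $\theta_1' \le \theta_1$ and $\theta_2' \le \theta_2$ but $\theta_1' \sqcup \theta_2' \ne \mathbf{ns}$. For example, although $(a = 3, b = 4)$ and $b = 5$ respectively have types $a\colon\mathbf{int}$ and $b\colon\mathbf{int}$, whose least upper bound is ns, the phrase

$$(a = 3, b = 4),\ b = 5$$

would be ambiguous.

Our solution to this problem is to permit $p_1, p_2$ only when $p_2$ is an abstraction or an object construction, whose meaning then overwrites all components of the meaning of $p_1$ that have procedural types, or all object types with the same field name. The inference rules are:

• Merging

$$\frac{[\pi \mid \iota\colon\theta_i] \vdash p_2 : \theta'}{\pi \vdash (p_1,\ \lambda\iota\{\colon\theta_1 \mid \cdots \mid \theta_n\}^{?}.\ p_2) : \theta_i \to \theta'}$$

$$\frac{\pi \vdash p_1 : \rho}{\pi \vdash (p_1,\ \lambda\iota\{\colon\theta_1 \mid \cdots \mid \theta_n\}^{?}.\ p_2) : \rho} \qquad \frac{\pi \vdash p_1 : (\iota_1\colon\theta)}{\pi \vdash (p_1,\ \lambda\iota\{\colon\theta_1 \mid \cdots \mid \theta_n\}^{?}.\ p_2) : (\iota_1\colon\theta)}$$

$$\frac{\pi \vdash p_2 : \theta}{\pi \vdash (p_1,\ \iota = p_2) : (\iota\colon\theta)}$$

$$\frac{\pi \vdash p_1 : \rho}{\pi \vdash (p_1,\ \iota = p_2) : \rho} \qquad \frac{\pi \vdash p_1 : \theta \to \theta'}{\pi \vdash (p_1,\ \iota = p_2) : \theta \to \theta'}$$

$$\frac{\pi \vdash p_1 : (\iota_1\colon\theta)}{\pi \vdash (p_1,\ \iota = p_2) : (\iota_1\colon\theta)} \quad \text{when } \iota \ne \iota_1$$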


Next,  we  introduce  a  construction  for  defining  a  sequence  by  giving  a  list  of  its  elements: 


• Sequences

$$\frac{\pi \vdash p_0 : \theta \qquad \cdots \qquad \pi \vdash p_{n-1} : \theta}{\pi \vdash \mathbf{seq}(p_0, \ldots, p_{n-1}) : (\mathbf{int} \to \theta) \& \mathit{len}\colon\mathbf{int}} \quad \text{when } n \ge 1$$

The effect of this construction is that, if $e$ is an integer expression with value $k$ such that $0 \le k < n$, then

$$\mathbf{seq}(p_0, \ldots, p_{n-1})\ e$$

has the same meaning as $p_k$ (and thus can be used in the role of a case construction). Moreover,

$$\mathbf{seq}(p_0, \ldots, p_{n-1}).\mathit{len}$$

is an integer expression with value $n$.
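A sequence built with seq behaves both as a procedure on indices and as an object with a len field. The Python class below is a minimal sketch of that dual role, under stated assumptions: elements are passed as zero-argument functions so that only the selected element is evaluated, and an out-of-range index raises `IndexError`, a behavior the text above does not specify.

```python
class Seq:
    """Model of seq(p0, ..., p_{n-1}): callable on an index, with a len field."""

    def __init__(self, *elements):
        if len(elements) < 1:
            raise ValueError("seq requires at least one element")
        self._elements = elements
        self.len = len(elements)      # the field selected by .len

    def __call__(self, k):
        if not 0 <= k < self.len:
            # Assumption: the source does not say what happens out of range.
            raise IndexError("sequence index out of range")
        return self._elements[k]()    # evaluate only the chosen element

evaluated = []

def element(label):
    """Build a delayed element that records when it is evaluated."""
    def run():
        evaluated.append(label)
        return label
    return run

s = Seq(element("zero"), element("one"), element("two"))
chosen = s(1)
length = s.len
```

Because only the selected element is evaluated, applying the sequence to an index plays the role of a case construction.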

Finally,  we  introduce  yet  another  form  of  definition,  to  permit  the  user  to  let  identifiers 
stand  for  types.  The  types  occurring  in  phrases  are  generalized  to  type  expressions  that  can 
contain  type  identifiers,  which  are  given  meaning  by  the  inference  rule 

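• Type Definitions

$$\frac{\pi \vdash (p / \iota_1, \ldots, \iota_n \to \theta_{1,i_1}, \ldots, \theta_{n,i_n}) : \theta}{\pi \vdash (\mathbf{lettype}\ \iota_1 = \theta_{1,1} \mid \cdots \mid \theta_{1,m_1}, \ldots, \iota_n = \theta_{n,1} \mid \cdots \mid \theta_{n,m_n}\ \mathbf{in}\ p) : \theta} \quad \text{where } 1 \le i_1 \le m_1, \ldots, 1 \le i_n \le m_n$$

where $(p / \iota_1, \ldots, \iota_n \to \theta_{1,i_1}, \ldots, \theta_{n,i_n})$ denotes the result of simultaneously substituting $\theta_{1,i_1}, \ldots, \theta_{n,i_n}$ for the free occurrences (as type identifiers) of $\iota_1, \ldots, \iota_n$ in type expressions within $p$. (As with the definitions described earlier, $\iota_1, \ldots, \iota_n$ must be distinct identifiers.)

As a simple example,

$$\mathbf{lettype}\ t = \mathbf{int}\ \mathbf{in}\ \lambda x\colon t.\ \lambda y\colon t.\ x \times y + 2$$

will have type $\mathbf{int} \to \mathbf{int} \to \mathbf{int}$. Notice that this is a transparent, rather than opaque, form of type definition; e.g. within its scope, $t$ is equivalent to int, rather than being an abstract type represented by integers (which would make the above example ill-typed).

Using the alternative operator in this construction provides another way to define procedures with multiple types. For example,

$$\mathbf{lettype}\ t = \mathbf{int} \mid \mathbf{real}\ \mathbf{in}\ \lambda x\colon t.\ \lambda y\colon t.\ x \times y + 2$$

will have both type $\mathbf{int} \to \mathbf{int} \to \mathbf{int}$ and $\mathbf{real} \to \mathbf{real} \to \mathbf{real}$. (The use of the alternative operator in type definitions was suggested by Benjamin Pierce [18].)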
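The instances of the type-definition rule correspond to the ways of choosing one alternative for each type identifier. A small Python sketch, with the hypothetical helper name `lettype_instances` and types represented as plain strings, enumerates those choices:

```python
from itertools import product

def lettype_instances(definitions):
    """All simultaneous substitutions, one alternative per type identifier."""
    names = list(definitions)
    return [dict(zip(names, choice))
            for choice in product(*(definitions[name] for name in names))]

# lettype t = int | real in ...   gives two substitutions for t.
single = lettype_instances({"t": ["int", "real"]})

# Two identifiers with 2 and 3 alternatives give 2 * 3 = 6 substitutions.
double = lettype_instances({"t": ["int", "real"],
                            "u": ["bool", "char", "int"]})
```

Note that both occurrences of $t$ in the example above receive the same alternative in any one instance, which is why the mixed type $\mathbf{int} \to \mathbf{real} \to \mathbf{real}$ does not arise directly from the rule.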

The  same  string  of  characters  can  be  used  as  both  an  ordinary  identifier  and  as  a  type 
identifier  without  interference.  A  change  in  its  binding  as  an  ordinary  identifier  has  no  effect 
on  its  meaning  as  a  type  identifier,  and  vice-versa. 


6.  Predefined  Identifiers 


In  place  of  various  constants,  Forsythe  provides  predefined  (ordinary)  identifiers,  which  may 
be  redefined  by  the  user,  but  which  take  on  standard  types  and  meanings  outside  of  these 
bindings. In describing these identifiers, we simply state the type of their unbound occurrences, e.g. we write $\mathit{true}\colon\mathbf{bool}$ as an abbreviation for the inference rule

$$\frac{}{\pi \vdash \mathit{true} : \mathbf{bool}} \quad \text{when } \mathit{true} \notin \mathrm{dom}\,\pi .$$

In the first place, there are the usual boolean constants, a skip command that leaves the state unchanged, and a standard phrase of type ns:

$$\mathit{true}\colon\mathbf{bool} \qquad \mathit{false}\colon\mathbf{bool} \qquad \mathit{skip}\colon\mathbf{comm} \qquad \mathit{null}\colon\mathbf{ns} .$$

(Of  course,  there  are  many  other  nonsense  phrases  —  phrases  whose  only  types  are  equivalent 
to  ns  —  which  are  all  too  easy  to  write,  but  null  is  the  only  such  phrase  that  will  not  activate 
a  warning  message  from  the  compiler.  The  point  is  that  there  are  contexts  in  which  null  is 
sensible,  for  example  as  the  denotation  of  an  object  with  no  fields.) 

The  remaining  predefined  identifiers  denote  built-in  procedures.  Four  of  these  procedures 
serve to declare variables. For $\delta$ = int, real, bool, or char:

$$\begin{array}{rl}
\mathit{new}\delta\mathit{var}\colon \delta \to & \bigl((\delta\mathbf{var} \to \mathbf{comm}) \to \mathbf{comm} \\
& \&\ (\delta\mathbf{var} \to \mathbf{compl}) \to \mathbf{compl} \\
& \&\ (\delta\mathbf{var} \to \mathbf{int}) \to \mathbf{int} \\
& \&\ (\delta\mathbf{var} \to \mathbf{real}) \to \mathbf{real} \\
& \&\ (\delta\mathbf{var} \to \mathbf{bool}) \to \mathbf{bool} \\
& \&\ (\delta\mathbf{var} \to \mathbf{char}) \to \mathbf{char}\bigr) .
\end{array}$$

The application $\mathit{new}\delta\mathit{var}\ \mathit{init}\ p$ causes a new $\delta$ variable to be added to the state of the computation; this variable is initialized to the value $\mathit{init}$, then the procedure $p$ is applied to the variable, and finally the new variable is removed from the state of the computation when the call of $p$ is completed (or when the execution of a nonlocal completion causes control to escape from $p$ so that the new variable can no longer be assigned or evaluated). Thus

$$\mathit{newintvar}\ \mathit{init}\ \lambda x.\ B$$

is equivalent to the Algol block

$$\mathbf{begin\ integer}\ x;\ x := \mathit{init};\ B\ \mathbf{end} .$$

The multiplicity of types of the $\mathit{new}\delta\mathit{var}$ procedures permits variables to be declared in completions and expressions as well as commands. Within expressions, however, locally declared variables, like any other variables, can be evaluated but not assigned.
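The declarator view of variable declaration can be approximated in Python by a higher-order function that creates a variable, passes it to a body, and disposes of it afterwards. This is only a sketch under stated assumptions: `IntVar`, its `get`/`set` methods, and the liveness check are illustrative inventions, and the sketch does not model the restriction that variables declared within expressions cannot be assigned.

```python
class IntVar:
    """A storage cell that can be evaluated and assigned while it is live."""

    def __init__(self, value):
        self._value = value
        self._live = True

    def _check(self):
        if not self._live:
            raise RuntimeError("variable used after its declaring call ended")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value

def newintvar(init, p):
    """Model of  newintvar init p : apply p to a fresh variable."""
    x = IntVar(init)
    try:
        return p(x)
    finally:
        # Removed from the state when p completes or control escapes from p.
        x._live = False

def block(x):
    x.set(x.get() + 1)
    return x.get() * 2

result = newintvar(20, block)   # like: begin integer x; x := 20; ... end

leaked = []
newintvar(0, leaked.append)     # smuggle the variable out of its scope
```

The `finally` clause mirrors the stack discipline described above: the variable disappears on normal completion and on a nonlocal exit alike.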
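Four analogous procedures are provided for declaring variable sequences:

$$\begin{array}{rl}
\mathit{new}\delta\mathit{varseq}\colon \mathbf{int} \to (\mathbf{int} \to \delta) \to & \bigl((\delta\mathbf{varseq} \to \mathbf{comm}) \to \mathbf{comm} \\
& \&\ (\delta\mathbf{varseq} \to \mathbf{compl}) \to \mathbf{compl} \\
& \&\ (\delta\mathbf{varseq} \to \mathbf{int}) \to \mathbf{int} \\
& \&\ (\delta\mathbf{varseq} \to \mathbf{real}) \to \mathbf{real} \\
& \&\ (\delta\mathbf{varseq} \to \mathbf{bool}) \to \mathbf{bool} \\
& \&\ (\delta\mathbf{varseq} \to \mathbf{char}) \to \mathbf{char}\bigr) .
\end{array}$$

The application $\mathit{new}\delta\mathit{varseq}\ l\ \mathit{init}\ p$ causes a new $\delta$ variable sequence of length $l$ to be added to the state of the computation; the elements of this sequence are initialized to values obtained by applying the procedure $\mathit{init}$, and then the procedure $p$ is applied to the sequence. Thus

$$\mathit{newintvarseq}\ l\ \mathit{init}\ \lambda x.\ B$$

is equivalent to the Algol block

$$\begin{array}{l}
\mathbf{begin\ integer\ array}\ x(0 : l - 1); \\
\quad \mathbf{begin\ integer}\ i; \\
\quad\quad \mathbf{for}\ i := 0\ \mathbf{to}\ l - 1\ \mathbf{do}\ x(i) := \mathit{init}(i) \\
\quad \mathbf{end}; \\
\quad B \\
\mathbf{end} .
\end{array}$$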


In essence, this approach to the declaration of variables and sequences is a syntactic desugaring of the conventional form of declarations into the application of a procedure; procedures such as $\mathit{newintvar}\ \mathit{init}$ or $\mathit{newintvarseq}\ l\ \mathit{init}$ that are intended to be used this way are called declarators. The advantage of this view is that the user can define his own declarators or declarator-producing procedures. For example (as we will illustrate later), the user can define his own declarators for any kind of array for which he can program the index-mapping function.

Another  declarator  is  provided  for  the  declaration  of  completions  that  cause  control  to 
escape  from  a  command.  The  procedure 

$$\mathit{escape}\colon (\mathbf{compl} \to \mathbf{comm}) \to \mathbf{comm}$$

applies its parameter to a completion whose execution causes immediate termination of the application of $\mathit{escape}$. Thus

$$\mathit{escape}\ \lambda e.\ C$$

is equivalent to the Algol block

$$\mathbf{begin}\ C';\ e\colon\ \mathbf{end} ,$$

where $C'$ is obtained from $C$ by substituting $\mathbf{goto}\ e$ for $e$.
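The behavior of escape can be imitated in Python with an exception that is private to each call. This is a sketch under the assumption that raising and catching an exception is an adequate stand-in for executing a completion; the names `Exit`, `c`, and `outer` are illustrative only.

```python
def escape(body):
    """Model of  escape body : run body with a completion that exits this call."""

    class Exit(Exception):
        """Raised only by the completion belonging to this call of escape."""

    def e():
        raise Exit()

    try:
        body(e)
    except Exit:
        pass  # control resumes just after the application of escape

log = []

def c(e):
    log.append("before")
    e()                     # like  goto e  in the Algol block
    log.append("after")     # never reached

escape(c)

def outer(e_outer):
    # The inner escape must not intercept the outer completion.
    escape(lambda e_inner: e_outer())
    log.append("skipped")   # never reached

escape(outer)
```

Creating a fresh exception class per call is what keeps nested uses apart, in the same way that each Algol block has its own label.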

Facilities  for  input  and  output  are  also  provided  by  declarators.  For  output,  there  is 

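$$\begin{array}{rl}
\mathit{newoutchannel}\colon \mathbf{charseq} \to & \bigl((\mathbf{characc} \to \mathbf{comm}) \to \mathbf{comm} \\
& \&\ (\mathbf{characc} \to \mathbf{compl}) \to \mathbf{compl}\bigr) .
\end{array}$$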

The  application  newoutchannel  s  p  opens  the  file  named  by  s  for  output  and  applies  the 
procedure  p  to  an  output  channel  c,  which  has  type  characc.  Each  time  p  assigns  a  character 
to  c,  the  character  is  output  to  the  file.  When  the  call  of  p  is  completed  (or  when  the 
execution  of  a  nonlocal  completion  causes  control  to  escape  from  p  so  that  c  can  no  longer 
be  executed),  the  file  is  closed.  (Our  choice  of  automatic  file-closure  upon  block  exit  is  based 
on  a  belief  that,  in  an  Algol-like  language,  file  buffers  should  obey  the  same  stack  discipline 
as  variables.) 

Since  output  channels  are  character  acceptors,  one  might  expect  input  channels  to  be 
character  expressions,  but  this  would  violate  the  principle  that  expressions  must  not  have 
side effects. Instead, an input channel has type $(\mathbf{characc} \& \mathit{eof}\colon\mathbf{compl}) \to \mathbf{comm}$, so that the declarator for input has type

$$\begin{array}{rl}
\mathit{newinchannel}\colon \mathbf{charseq} \to & \bigl((((\mathbf{characc} \& \mathit{eof}\colon\mathbf{compl}) \to \mathbf{comm}) \to \mathbf{comm}) \to \mathbf{comm} \\
& \&\ (((\mathbf{characc} \& \mathit{eof}\colon\mathbf{compl}) \to \mathbf{comm}) \to \mathbf{compl}) \to \mathbf{compl}\bigr) .
\end{array}$$

The application $\mathit{newinchannel}\ s\ p$ opens the file named by $s$ for input and applies the procedure $p$ to an input channel $c$. Each time $p$ executes a call $c(a, \mathit{eof} = k)$ the next character is read from the file and passed as an argument to the character acceptor $a$, unless an end-of-file has occurred, in which case the completion $k$ is executed. When the call of $p$ is completed (or when a nonlocal completion causes an escape from $p$), the file is closed.
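The channel declarators can be approximated in Python, where a `with` block supplies the automatic file closure on exit. This sketch rests on several assumptions: the helper `read_all`, the `EndOfFile` exception used as the end-of-file completion, and the temporary file name are invented for the illustration, and Python strings of length one stand in for characters.

```python
import os
import tempfile

def newoutchannel(name, p):
    """Open the named file for output and apply p to a character acceptor."""
    with open(name, "w", encoding="utf-8") as f:
        def c(ch):
            f.write(ch)          # each accepted character goes to the file
        return p(c)              # the file is closed when p completes

def newinchannel(name, p):
    """Open the named file for input and apply p to an input channel."""
    with open(name, "r", encoding="utf-8") as f:
        def c(a, eof):
            ch = f.read(1)
            if ch:
                a(ch)            # pass the next character to the acceptor
            else:
                eof()            # execute the end-of-file completion
        return p(c)

class EndOfFile(Exception):
    """Stand-in for a completion that leaves the reading loop."""

def read_all(c):
    chars = []
    def stop():
        raise EndOfFile()
    try:
        while True:
            c(chars.append, eof=stop)
    except EndOfFile:
        return "".join(chars)

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.txt")   # hypothetical file name
    newoutchannel(path, lambda c: [c(ch) for ch in "Algol"])
    text = newinchannel(path, read_all)
```

Note that the input channel is a procedure taking an acceptor, not an expression yielding a character, so reading never appears as a side effect of evaluating an expression.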

Standard input and output are provided by channels that are named by predefined identifiers:

$$\mathit{std\_in}\colon (\mathbf{characc} \& \mathit{eof}\colon\mathbf{compl}) \to \mathbf{comm} \qquad \mathit{std\_out}\colon \mathbf{characc} .$$

Some obvious functional procedures are provided to convert between characters and their integer codes:

$$\mathit{char\_to\_code}\colon \mathbf{char} \to \mathbf{int} \qquad \mathit{code\_to\_char}\colon \mathbf{int} \to \mathbf{char} ,$$

and to convert character sequences in decimal notation into integers and real numbers:

$$\mathit{charseq\_to\_int}\colon \mathbf{charseq} \to \mathbf{int} \qquad \mathit{charseq\_to\_real}\colon \mathbf{charseq} \to \mathbf{real} .$$

The last two procedures ignore nondigits, except for leading minus signs and (in the case of $\mathit{charseq\_to\_real}$) the first occurrence of a decimal point.




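One might expect the procedure that converts integers into their decimal representations to have the type $\mathbf{int} \to \mathbf{charseq}$, but this would be unsuitable since charseq is not a data type. Instead, there is another declarator:

$$\begin{array}{rl}
\mathit{int\_to\_charseq}\colon \mathbf{int} \to & \bigl((\mathbf{charseq} \to \mathbf{comm}) \to \mathbf{comm} \\
& \&\ (\mathbf{charseq} \to \mathbf{compl}) \to \mathbf{compl} \\
& \&\ (\mathbf{charseq} \to \mathbf{int}) \to \mathbf{int} \\
& \&\ (\mathbf{charseq} \to \mathbf{real}) \to \mathbf{real} \\
& \&\ (\mathbf{charseq} \to \mathbf{bool}) \to \mathbf{bool} \\
& \&\ (\mathbf{charseq} \to \mathbf{char}) \to \mathbf{char}\bigr) .
\end{array}$$

The application $\mathit{int\_to\_charseq}\ n\ p$ converts the integer $n$ to a character sequence giving its decimal representation, and applies the procedure $p$ to this character sequence.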


The  conversion  of  real  numbers  to  a  decimal  representation  is  considerably  more  complex, 
for  several  reasons:  One  must  deal  with  both  a  fraction  and  exponent,  the  digit-length  of  the 
fraction  must  be  specified,  and  there  is  no  universally  accepted  notation.  The  conversion  is 
implemented  by  the  declarator 


    real_to_charseq: real → int →
        ((charseq → int → comm) → comm
         & (charseq → int → compl) → compl
         & (charseq → int → int) → int
         & (charseq → int → real) → real
         & (charseq → int → bool) → bool
         & (charseq → int → char) → char) .

Let r be a positive nonzero real number and f × 10^x be a closest approximation to r such that x is an integer, 0.1 ≤ f < 1.0, and f has a decimal representation containing d digits (to the right of the decimal point). Then real_to_charseq r d p applies the procedure p to the digit sequence representing f (excluding the decimal point) and the integer x. (Notice that x, as well as f, can depend upon d when r is slightly less than a power of ten.)
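The specification above can be mimicked outside Forsythe. The following Python sketch is an illustration only, not the Forsythe implementation: it assumes that Python's own float formatting is an acceptable notion of "closest approximation", and the function and parameter names are chosen here merely to mirror the text.

```python
def real_to_charseq(r, d, p):
    """Apply p to (digits, x), where 0.digits * 10**x approximates r with d digits."""
    if r <= 0 or d < 1:
        raise ValueError("r must be positive and d must be at least 1")
    # Scientific notation with d significant digits, e.g. '9.99960e+00'.
    mantissa, exponent = format(r, ".{}e".format(d - 1)).split("e")
    digits = mantissa.replace(".", "")   # the d digits of f, decimal point dropped
    x = int(exponent) + 1                # shift the point so that 0.1 <= f < 1.0
    return p(digits, x)


def pair(s, x):
    return (s, x)

six_digits = real_to_charseq(9.9996, 6, pair)    # f = 0.999960, x = 1
three_digits = real_to_charseq(9.9996, 3, pair)  # rounds up to f = 0.100, x = 2
```

The two calls show the remark in the text: for a value slightly below a power of ten, both the digit sequence and the exponent change with d.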

Clearly,  if  Forsythe  grows  beyond  the  experimental  stage,  it  will  be  necessary  for  the 
predefined  identifiers  to  provide  richer  capabilities  than  are  described  above.  To  do  this  in 
an  “upward  compatible”  manner,  one  can  obviously  add  new  predefined  identifiers.  But  the 
type  system  provides  another,  more  interesting  possibility:  One  can  lower  the  type  of  an 
existing  predefined  identifier  to  a  subtype  of  its  original  type,  and  give  a  new  meaning  to 
the  identifier,  providing  the  implicit  coercion  induced  by  the  subtype  relation  maps  the  new 
meaning  back  into  the  old  one. 

For example, one might change the type of newoutchannel to

    newoutchannel: charseq →
        (((characc & flush: comm) → comm) → comm
         & ((characc & flush: comm) → compl) → compl) .

As before, the application newoutchannel s p would open the file s and apply the procedure p to an output channel c, and each time that p assigned a character to c, the character would be output to the file. But now p could also execute the command c.flush, which might flush the output buffer.


7.  Syntactic  Sugar 


Several abbreviations are provided to avoid repeating type information (or the absence thereof) when several identifiers range over the same type. In types,

    ι₁, ..., ιₙ: θ    abbreviates    ι₁: θ & ··· & ιₙ: θ .

In abstractions

    λι₁, ..., ιₙ: θ₁ | ··· | θₖ. p    abbreviates    λι₁: θ₁ | ··· | θₖ. ... λιₙ: θ₁ | ··· | θₖ. p

and

    λι₁, ..., ιₙ. p    abbreviates    λι₁. ... λιₙ. p .

In recursive definitions

    ι₁, ..., ιₙ: θ    abbreviates    ι₁: θ, ..., ιₙ: θ .

In each of these cases, the identifiers ι₁, ..., ιₙ must be distinct. Also, to permit a more Algol-like appearance,

    p₁ := p₂    abbreviates    p₁ p₂ .


Finally, although we treated them as independent constructions in Section 5, the definitional forms can be regarded as abbreviations. First, recursive definitions can be desugared in terms of nonrecursive definitions and the fixed-point operator. For a single definition, one can give a straightforward desugaring:

    letrec ι₁: θ₁ where ι₁ = p₁ in p    abbreviates    let ι₁ = rec: θ₁. λι₁. p₁ in p .

However, a general rule that includes simultaneous recursion is more complex:

    letrec ι₁: θ₁, ..., ιₙ: θₙ where ι₁ = p₁, ..., ιₙ = pₙ in p    abbreviates

    let ι = rec: (ι₁: θ₁ & ··· & ιₙ: θₙ). λι.
            let ι₁ = ι.ι₁, ..., ιₙ = ι.ιₙ in (ι₁ = p₁, ..., ιₙ = pₙ)
    in let ι₁ = ι.ι₁, ..., ιₙ = ι.ιₙ in p ,

where ι is any identifier not occurring in the letrec definition.

Second, nonrecursive definitions can be desugared in terms of abstractions and applications (following Landin):

    let ι₁: θ₁ = p₁, ..., ιₙ: θₙ = pₙ in p    abbreviates    (λι₁: θ₁. ··· λιₙ: θₙ. p) p₁ ··· pₙ .

This rule continues to make sense if, for some i, the types θᵢ are omitted. But in this case the right side will contain too little information to satisfy the typechecker, even though the left side may be satisfactory.
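The two desugarings can be tried out in any language with first-class functions. The Python sketch below is only an analogy under stated assumptions: Python is untyped and call by value, so the recursive occurrence inside rec is delayed by an explicit lambda, and the name rec is borrowed from the text rather than from any library.

```python
def rec(p):
    """Fixed point: rec p => p(rec p), with the inner 'rec p' delayed until it is called."""
    return p(lambda *args: rec(p)(*args))


# letrec fact: int -> int where fact = (lambda n. ...) in fact 5
# first becomes   let fact = rec (lambda fact. lambda n. ...) in fact 5
# and the let then becomes an abstraction applied to the defining phrase:
letrec_result = (lambda fact: fact(5))(
    rec(lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1))
)

# let x = 2, y = 3 in x + y   abbreviates   (lambda x. lambda y. x + y) 2 3
let_result = (lambda x: lambda y: x + y)(2)(3)
```

The delay inside rec is what keeps the unfolding from running forever in a call-by-value host; in Forsythe itself call by name makes it unnecessary.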


8.  Reduction  Rules 


As  mentioned  in  the  introduction,  an  operational  way  of  describing  Forsythe  is  to  say  that  a 
program  is  a  phrase  of  type  comm,  in  an  enriched  typed  lambda  calculus,  that  is  executed 
by  first  reducing  the  phrase  to  normal  form  (more  precisely,  to  a  possibly  infinite  or  partial 
normal  form)  and  then  executing  the  normal  form,  which  will  be  a  program  in  the  simple 
imperative  language.  Although  we  will  not  pursue  this  view  in  detail,  it  is  useful  to  list  some 
of  the  reduction  rules,  which  preserve  the  meanings  of  programs  and  thus  provide  insight 
into  their  semantics.  (We  will  ignore  types  in  these  rules,  since  they  play  no  role  in  the 
process  of  reduction.) 

First there is the lambda-calculus rule of β-reduction:

    (λι. p₁) p₂ ⇒ (p₁/ι → p₂) ,

where (p₁/ι → p₂) denotes the result of substituting p₂ for the free occurrences of ι (except as a type identifier or field name) in p₁.

Then there is a rule for selecting fields:

    (ι = p).ι ⇒ p ,

two rules for conditionals:

    (if p₁ then p₂ else p₃) p₄ ⇒ if p₁ then p₂ p₄ else p₃ p₄
    (if p₁ then p₂ else p₃).ι ⇒ if p₁ then p₂.ι else p₃.ι ,

a rule for nonrecursive definitions:

    let ι₁ = p₁, ..., ιₙ = pₙ in p ⇒ (p/ι₁, ..., ιₙ → p₁, ..., pₙ) ,

and a rule for the fixed-point operator:

    rec p ⇒ p(rec p) .

In addition, there are a number of rules dealing with the merging operation:

    (p₁, λι. p₂) p₃ ⇒ (λι. p₂) p₃
    (p₁, ι = p₂) p₃ ⇒ p₁ p₃
    (p₁, λι. p₂).ι′ ⇒ p₁.ι′
    (p₁, ι = p₂).ι ⇒ (ι = p₂).ι
    (p₁, ι = p₂).ι′ ⇒ p₁.ι′    when ι ≠ ι′ .

(It should be noted that these rules are not complete; in particular, it is not clear how to provide rules for reducing merges in contexts that require primitive types.)

The reduction rules make it clear that call by name pervades Forsythe. For example, if p_c is any phrase that does not contain free occurrences of ι, and p₁ and p₂ are any phrases, then

    (λι. p_c) p₁ ⇒ p_c
    let ι = p₁ in p_c ⇒ p_c
    (ι₁ = p₁, ι₂ = p₂).ι₁ ⇒ p₁
    (ι₁ = p₁, ι₂ = p₂).ι₂ ⇒ p₂

hold even when p₁ or p₂ denote nonterminating computations.

Moreover, call by name even characterizes the assignment operation, since assignments are abbreviations for procedure calls. For example, assuming that x is an ordinary integer variable (e.g. declared using newintvar),

    (λy. x := 3) := p    and    (λy. x := y + y) p

would  evaluate  the  expression  p  zero  and  two  times  respectively.  This  is  probably  the  most 
controversial  design  decision  in  Forsythe,  since  it  makes  the  language,  so  to  speak,  more 
Algol-like  than  Algol  itself.  It  may  degrade  the  efficiency  with  which  the  language  can 
be  implemented  but,  as  demonstrated  in  Sections  11  and  13,  it  leads  to  some  interesting 
programming  techniques. 
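The evaluation counts claimed above can be observed by modelling a call-by-name parameter as a parameterless function (a thunk). The Python sketch below is an analogy, not Forsythe semantics: Var, counting, assign, and diverge are hypothetical helpers introduced only for this illustration.

```python
class Var:
    """A mutable cell standing in for an integer variable."""
    def __init__(self, value):
        self.value = value


def counting(thunk):
    """Wrap a thunk so that the number of times it is evaluated can be observed."""
    count = Var(0)
    def wrapped():
        count.value += 1
        return thunk()
    return wrapped, count


def assign(var, value):
    var.value = value


x = Var(0)

# (lambda y. x := 3) := p     the parameter is never evaluated
p0, evals0 = counting(lambda: 21)
(lambda y: assign(x, 3))(p0)
after_first = x.value

# (lambda y. x := y + y) p    the parameter is evaluated once per occurrence of y
p2, evals2 = counting(lambda: 21)
(lambda y: assign(x, y() + y()))(p2)


def diverge():
    while True:
        pass

# (lambda i. p_c) p1 => p_c holds even when p1 does not terminate,
# because the thunk is passed along but never called.
constant = (lambda i: 7)(diverge)
```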


9.  Examples  of  Procedures 


In  this  and  the  next  four  sections,  we  provide  a  variety  of  examples  of  Forsythe  programs. 
Many  of  these  examples  are  translations  of  Algol  W  programs  given  in  [10],  which  the  reader 
may  wish  to  compare  with  the  present  versions. 

To define a proper procedure that sets its second parameter to the factorial of its first parameter, we define fact to be the obvious command, abstracted on an integer expression n and an integer variable f:

    let fact: int → intvar → comm = λn. λf.
        newintvar 0 λk.
        (f := 1 ; while k ≠ n do (k := k + 1 ; f := k × f))

Here we have specified the necessary types in the nonrecursive definition, but instead we could have specified them in the abstractions:

    let fact = λn: int. λf: intvar.
        newintvar 0 λk.
        (f := 1 ; while k ≠ n do (k := k + 1 ; f := k × f))

(Although Forsythe supports either method of specifying the types of nonrecursive procedures, in these examples we will usually give types in definitions rather than in abstractions, since this approach is more readable, and in some cases gives more efficient typechecking.)

This procedure has the usual shortcoming of call by name: It will repeatedly evaluate the expression n. To remedy this defect, we replace n by a local variable (also called n) that is initialized to the input parameter n. Notice that this is equivalent to the definition of call by value in Algol 60.

    let fact: int → intvar → comm = λn. λf.
        newintvar n λn.
        newintvar 0 λk.
        (f := 1 ; while k ≠ n do (k := k + 1 ; f := k × f))

We can also modify this procedure to obtain the effect of calling f by result (as in Algol W [4]). We replace f by a local variable, and then assign the final value of this local variable to the parameter f, which now has type intacc, since it is never evaluated by the procedure.

    let fact: int → intacc → comm = λn. λf.
        newintvar n λn. newintvar 1 λlocalf.
        (newintvar 0 λk.
            while k ≠ n do (k := k + 1 ; localf := k × localf) ;
         f := localf)

This transformation is sufficiently complex that it is worthwhile to encapsulate it as a procedure. We define

    letinline newintvarres: int → intacc → (intvar → comm) → comm =
        λinit. λfin. λb. newintvar init λlocal. (b local ; fin := local)

Then to call f by result we define

    let fact: int → intacc → comm = λn. λf.
        newintvar n λn. newintvarres 1 f λf.
        newintvar 0 λk.
        while k ≠ n do (k := k + 1 ; f := k × f)

When placed within the scope of the definition of newintvarres, this definition of fact reduces to the previous one (except for the names of bound identifiers) and therefore has the same meaning. Moreover, since newintvarres is defined by letinline, the two definitions of fact will compile into the same machine code.
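Call by value and call by result, as simulated above with local variables, can be imitated in Python. This is a rough sketch under assumptions: a call-by-name integer is modelled as a thunk, an acceptor as any callable taking an integer, and Var is a hypothetical helper class; only the names newintvarres and fact come from the text.

```python
class Var:
    """A mutable cell standing in for an integer variable."""
    def __init__(self, value):
        self.value = value


def newintvarres(init, fin, body):
    """Run body on a fresh local variable, then hand its final value to fin."""
    local = Var(init)
    body(local)
    fin(local.value)              # 'fin := local' is just the call 'fin local'


def fact(n, f):
    """n is a thunk (call by name); f is an acceptor of the result."""
    n_value = n()                 # call by value: evaluate n once into a local
    def body(local_f):
        k = 0
        while k != n_value:
            k += 1
            local_f.value = k * local_f.value
    newintvarres(1, f, body)      # call by result: f receives only the final value


evaluations = []
def n():
    evaluations.append("n evaluated")
    return 5

results = []
fact(n, results.append)
```

Here the acceptor is simply the append method of a list, which shows that the second parameter need not be a variable at all.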

We can also define the traditional recursive function procedure for computing the factorial. Here again we call n by value, illustrating the use of newintvar within an expression.

    letrec fact: int → int
    where fact = λn. newintvar n λn.
        if n = 0 then 1 else n × fact(n − 1)

Next, we give some examples of procedures that take advantage of call by name. In the following function procedure for integer multiplication, call by name is used to provide “short-circuit” evaluation,

    letinline multiply: int → int → int = λm. λn. if m = 0 then 0 else m × n

i.e. n will not be evaluated when m is zero. In a proper procedure akin to the Pascal repeat command,

    letinline repeat: comm → bool → comm = λc. λb. (c ; while ~ b do c)

b must be called by name to permit its repeated evaluation. (Both multiply and repeat are such simple procedures that it is obviously worthwhile to compile them inline.)
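Both procedures translate directly once call-by-name parameters are written as thunks. The Python sketch below is an illustration under that assumption; the helper never is hypothetical and exists only to show that the second operand is not touched.

```python
def multiply(m, n):
    """m and n are thunks; n is evaluated only when m is nonzero."""
    return 0 if m() == 0 else m() * n()


def repeat(c, b):
    """Execute c once, then keep executing it until the thunk b yields true."""
    c()
    while not b():
        c()


def never():
    raise RuntimeError("this operand should not be evaluated")

short_circuit = multiply(lambda: 0, never)

ticks = []
repeat(lambda: ticks.append(len(ticks)), lambda: len(ticks) >= 3)
```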

Repeated evaluation is also crucial to the following program, where the call of the procedure sum sets s to Σ_{i=a}^{b} X(i) × Y(i) by repeatedly evaluating X(i) × Y(i) while increasing the variable i:

    letinline sum: intvar → int → comm = λi. λe.
        begin s := 0 ; i := a − 1 ;
            while i < b do (i := i + 1 ; s := s + e)
        end
    in sum i (X(i) × Y(i))

This way of using call by name, known as “Jensen’s device”, was illustrated in the original Algol 60 Report [2, 3] by the exemplary procedure Innerproduct.
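Jensen's device can be reproduced wherever an expression can be passed unevaluated. In the Python sketch below, which is an analogy rather than Forsythe, the expression is a thunk that reads the shared variable i; the bounds a and b, which are free in the text, are passed as ordinary arguments, and Var and jensen_sum are names assumed for this example.

```python
class Var:
    """A mutable cell standing in for the integer variable i."""
    def __init__(self, value):
        self.value = value


def jensen_sum(i, e, a, b):
    """Sum the thunk e while stepping the variable i from a to b inclusive."""
    s = 0
    i.value = a - 1
    while i.value < b:
        i.value += 1
        s += e()          # e is re-evaluated and sees the new value of i
    return s


X = [1, 2, 3]
Y = [4, 5, 6]
i = Var(0)
inner_product = jensen_sum(i, lambda: X[i.value] * Y[i.value], 0, 2)
```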

Finally, we give two higher-order procedures akin to the for command:

    letinline for: int → int → (int → comm) → comm = λl. λu. λb.
            newintvar(l − 1) λk. newintvar u λu.
            while k < u do (k := k + 1 ; b k),
        fordown: int → int → (int → comm) → comm = λl. λu. λb.
            newintvar(u + 1) λk. newintvar l λl.
            while k > l do (k := k − 1 ; b k)
    in for 0 9 λi. s := s + X(i) × Y(i)

Notice that, in these procedures, since the procedure b takes a parameter of type int, the application b k cannot change the value of k. Moreover, although this application can change the values of the parameters l and u, the interval iterated over is always determined by the initial values of these parameters.

Even though the procedures sum, for, and fordown are moderately complex, we have used letinline to define them, since they evaluate some of their parameters repeatedly. When a procedure is defined by letinline, not only are its calls compiled into inline code, but also the execution or call of the parameters of the procedure. In particular, the expressions i and X(i) × Y(i) in the call of sum will be executed inline, and the procedure λi. s := s + X(i) × Y(i) in the call of for will be called inline.
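The point that the iterated interval is fixed on entry can be checked with a small Python model. This is a sketch under assumptions: the bounds are thunks evaluated once, as the newintvar initializations do in the text, and the trailing underscore in for_ only avoids the Python keyword.

```python
def for_(l, u, b):
    """Call b(k) for k from l to u, with both bounds evaluated once on entry."""
    k = l() - 1
    upper = u()
    while k < upper:
        k += 1
        b(k)


def fordown(l, u, b):
    """Call b(k) for k from u down to l, with both bounds evaluated once on entry."""
    k = u() + 1
    lower = l()
    while k > lower:
        k -= 1
        b(k)


limit = [3]
seen = []

def body(k):
    seen.append(k)
    limit[0] = 100        # changing the bound expression has no effect on the loop

for_(lambda: 0, lambda: limit[0], body)

down = []
fordown(lambda: 1, lambda: 3, down.append)
```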


10.  Escapes  and  Completions 

The procedure escape declares a completion whose execution causes an exit from the call of escape. A simple example of its use is the following procedure for searching an integer function X (which might be an integer sequence or array) over the interval l to u for a value that is equal to y. If such a value is found, the procedure sets present to true and j to the argument for which X(j) = y; otherwise it sets present to false.

    let linsearch: (int → int) → int → int → int → boolacc → intacc → comm =
        λX. λl. λu. λy. λpresent. λj.
        escape λout.
        (for l u λk. if X(k) = y then (present := true ; j := k ; out) else skip ;
         present := false)

An alternative version of this procedure branches to one of two parameters depending upon whether the search succeeds. If the search fails, it goes to the completion failure; if the search succeeds, it goes to the completion procedure success, passing it the integer k such that X(k) = y.

    let linsearch:
        (int → int) → int → int → int → (int → compl) → compl → compl =
        λX. λl. λu. λy. λsuccess. λfailure.
        (for l u λk. if X(k) = y then success k else skip ; failure)
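Both versions of the search have close analogues in Python, given here as a sketch rather than a faithful translation. The assumption is that a completion can be modelled by raising a private exception (for escape) or by returning the result of a continuation call (for success and failure); escape, linsearch_cps, and the sample data are names introduced for this illustration.

```python
def escape(body):
    """Run body(out); calling out() abandons the rest of body immediately."""
    class Exit(Exception):
        pass
    def out():
        raise Exit()
    try:
        body(out)
    except Exit:
        pass


def linsearch(X, l, u, y, present, j):
    """present and j are acceptors: callables that receive the results."""
    def body(out):
        for k in range(l, u + 1):
            if X(k) == y:
                present(True)
                j(k)
                out()
        present(False)
    escape(body)


def linsearch_cps(X, l, u, y, success, failure):
    """Variant that finishes by calling one of two continuations."""
    for k in range(l, u + 1):
        if X(k) == y:
            return success(k)
    return failure()


data = [7, 3, 9, 3]
lookup = lambda k: data[k]

found = {}
linsearch(lookup, 0, 3, 9,
          lambda b: found.update(present=b), lambda k: found.update(j=k))

missing = {}
linsearch(lookup, 0, 3, 5,
          lambda b: missing.update(present=b), lambda k: missing.update(j=k))
```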


This  illustrates  that,  if  a  procedure  always  terminates  by  executing  a  completion,  then  a 
call  of  the  procedure  will  itself  be  a  completion.  This  fact  is  relevant  when  the  last  action 
of  a  procedure  is  to  assign  to  a  parameter,  since  an  assignment  to  a  parameter  is  syntactic 
sugar  for  a  call  of  the  parameter.  For  example,  the  following  is  an  alternative  typing  of  the 
procedure  newintvarres  introduced  in  the  previous  section: 

    letinline newintvarres: int → (int → compl) → (intvar → comm) → compl =
        λinit. λfin. λb. newintvar init λlocal. (b local ; fin := local)

This makes sense because the final assignment fin := local really means fin local, which will be a completion when fin has type int → compl. One can even use intersection to give newintvarres both its conventional type and this variant in a single definition:

    letinline newintvarres: int → (intacc → (intvar → comm) → comm
                                   & (int → compl) → (intvar → comm) → compl) =
        λinit. λfin. λb. newintvar init λlocal. (b local ; fin local)

In addition to using escape, one can define completions recursively, to obtain the equivalent of conventional labels. For example, the following procedure sets y to x^n (in time log n), without doing unnecessary tests:

    let power: int → int → intacc → comm = λx. λn. λy.
        newintvar n λk. newintvarres 1 y λy. newintvar x λz.
        escape λzr {k = 0}.
        letrec tr, nz, ev, od, nzev: compl
        where
            tr {true} = if k = 0 then zr else nz,
            nz {k ≠ 0} = if k rem 2 ≠ 0 then od else nzev,
            ev {even k} = if k = 0 then zr else nzev,
            od {odd k} = k := k − 1 ; y := y × z ; ev,
            nzev {k ≠ 0 ∧ even k} = k := k ÷ 2 ; z := z × z ; nz
        in tr

The invariant of this program, i.e. the assertion that holds whenever any completion is executed, is

    y × z^k = x^n ∧ k ≥ 0 .

The additional assertions given as comments at the binding of each completion hold whenever the corresponding completion is executed.

Notice  that,  in  contrast  to  labels,  one  can  never  execute  a  completion  by  “passing  through” 
to  its  definition.  Indeed,  the  meaning  of  the  above  program  is  independent  of  the  order  of 
the  definitions  of  completions. 
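The state machine formed by the five completions can be imitated with mutually recursive local functions, each of which finishes by calling the next. The Python sketch below is an analogy under assumptions: n is taken to be nonnegative, the result is returned rather than assigned to an acceptor, and the escape completion zr is modelled as the function that returns y.

```python
def power(x, n):
    """Compute x**n for n >= 0; y * z**k == x**n holds whenever a state is entered."""
    k, y, z = n, 1, x

    def tr():                       # no assumption about k
        return zr() if k == 0 else nz()

    def nz():                       # k != 0
        return od() if k % 2 != 0 else nzev()

    def ev():                       # k is even
        return zr() if k == 0 else nzev()

    def od():                       # k is odd
        nonlocal k, y
        k -= 1
        y *= z
        return ev()

    def nzev():                     # k != 0 and k is even
        nonlocal k, z
        k //= 2
        z *= z
        return nz()

    def zr():                       # k == 0, so y == x**n
        return y

    return tr()
```

Because the functions are bound by name, their textual order is irrelevant, which matches the remark that the meaning is independent of the order of the definitions.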


11.  Sequences  and  Arrays 


As we remarked in Section 6, using the built-in procedures for declaring variable sequences, the programmer can define his own procedures for declaring more complex kinds of arrays. For example, suppose we want Algol-like one-dimensional integer arrays with arbitrary lower and upper bounds (denoted by the field names ll and ul). First we define abbreviations for the relevant types:

    lettype intarray = (int → int & ll, ul: int),
        intaccarray = (int → intacc & ll, ul: int),
        intvararray = (int → intvar & ll, ul: int)


Then  the  following  procedure  serves  to  declare  integer  variable  arrays,  which  are  defined  in 
terms  of  integer  sequences: 


    letinline newintvararray: int → int → (int → int) →
            ((intvararray → comm) → comm & (intvararray → compl) → compl
             & (intvararray → int) → int & (intvararray → real) → real
             & (intvararray → bool) → bool & (intvararray → char) → char) =
        λl. λu. λinit. λb. newintvar l λl. newintvar u λu.
        newintvarseq(u − l + 1)(λk. init(k + l)) λX.
        b(λk. X(k − l), ll = l, ul = u)


The sixfold intersection that is the type of newintvararray is similar to the type of built-in declarators such as newintvarseq. Equally well, one could provide the necessary type information in the abstractions rather than the definition, using the alternative operator:

    letinline newintvararray = λl, u: int. λinit: int → int.
        λb: intvararray → comm | intvararray → compl | intvararray → int |
            intvararray → real | intvararray → bool | intvararray → char.
        newintvar l λl. newintvar u λu.
        newintvarseq(u − l + 1)(λk. init(k + l)) λX.
        b(λk. X(k − l), ll = l, ul = u)

We can also define a procedure slice that, given an array and two integers, yields a subsegment of the array with new bounds. The simplest definition is

    letinline slice: ((int → int) → int → int → intarray
                      & (int → intacc) → int → int → intaccarray) =
        λX. λl. λu. (X, ll = l, ul = u)

A safer alternative, which checks applications of the array against the new bounds, is

    letinline slicecheck: ((int → int) → int → int → intarray
                           & (int → intacc) → int → int → intaccarray) =
        λX. λl. λu. (λk. if l ≤ k ∧ k ≤ u then X k else error ‘subscript error’,
                     ll = l, ul = u)


The type of slice and slicecheck is extremely general:

• If the first argument has type int → int, or any subtype, such as intseq or intarray, then the call will have type intarray.

• If the first argument has type int → intacc, or any subtype, such as intaccseq or intaccarray, then the call will have type intaccarray.

• If the first argument has type int → intvar, or any subtype, such as intvarseq or intvararray, then both of the previous cases will apply, and the call will have type intarray & intaccarray, which is equivalent to intvararray.
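The idea of an array as an object with an element function and bound fields, and of slices as new objects sharing the same storage, can be sketched in Python. This is a loose analogy under assumptions: Python has no intersection types, so reading and writing are modelled as two methods (get and put), and new_array returns the array instead of passing it to a body as a Forsythe declarator would. The names Array, new_array, get, and put are hypothetical.

```python
class Array:
    """An element reader, an element writer, and the bounds ll and ul."""
    def __init__(self, get, put, ll, ul):
        self.get, self.put, self.ll, self.ul = get, put, ll, ul


def slice_(X, l, u):
    """Same elements, new bounds, no checking."""
    return Array(X.get, X.put, l, u)


def slicecheck(X, l, u):
    """Same elements, new bounds, every subscript checked against them."""
    def check(k):
        if not (l <= k <= u):
            raise IndexError("subscript error")
        return k
    return Array(lambda k: X.get(check(k)),
                 lambda k, v: X.put(check(k), v), l, u)


def new_array(l, u, init):
    """An array with bounds l..u backed by a zero-based list."""
    store = [init(k) for k in range(l, u + 1)]
    def get(k):
        return store[k - l]
    def put(k, v):
        store[k - l] = v
    return slicecheck(Array(get, put, l, u), l, u)


A = new_array(1, 5, lambda k: k * k)
S = slicecheck(A, 2, 4)
S.put(2, 0)                      # assignment through the slice reaches A

try:
    S.get(5)                     # inside A's bounds but outside the slice
    out_of_bounds_rejected = False
except IndexError:
    out_of_bounds_rejected = True
```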

The use of these procedures is illustrated by a program for sorting by finding maxima. First we define a procedure that sets j to the subscript of a maximum of an array X:

    letinline max: intarray → intacc → comm = λX. λj.
        newintvar X.ll λa. newintvar X.ul λb.
        newintvarres a j λj.
        while a < b do (a := a + 1 ; if X a > X j then j := a else skip)

(Here giving X the type intarray rather than intvararray indicates that max will examine X but not assign to it.) Next comes a procedure for exchanging a pair of array elements:

    letinline exchange: (int → intvar) → int → int → comm = λX. λi. λj.
        newintvar i λi. newintvar j λj.
        newintvar(X i) λt. (X i := X j ; X j := t)

(Giving X the type int → intvar indicates that exchange does not evaluate bounds; e.g. it would also be applicable to an integer variable sequence.) Then the sort procedure is:

    let maxsort: intvararray → comm = λX.
        newintvar X.ll λa. newintvar X.ul λb.
        while a < b do newintvar 0 λj.
        (max(slice X a b) j ; exchange X j b ; b := b − 1)


The above procedure contains a spurious initialization of the variable j. The purpose of this variable is to accept the result of max, but newintvar requires us to initialize it before calling max. However, this unpleasantness can be avoided by taking advantage of the fact that assignments are really applications. By substituting the definition of newintvarres into the definition of max and reducing, we find that the definition of max is equivalent to

    letinline max: intarray → intacc → comm = λX. λj.
        newintvar X.ll λa. newintvar X.ul λb.
        newintvar a λlocal.
        (while a < b do
            (a := a + 1 ; if X a > X local then local := a else skip) ;
         j := local)

where j := local is syntactic sugar for j local. Thus the second parameter of max can be any procedure of type int → comm, i.e. any proper procedure accepting an integer; the effect of max will be to apply this procedure to the subscript of the maximum of X.

To avoid the spurious initialization, we make this parameter a procedure that carries out the appropriate exchange, dispensing with the variable j entirely:

    let maxsort: intvararray → comm = λX.
        newintvar X.ll λa. newintvar X.ul λb.
        while a < b do
        (max(slice X a b) λj. exchange X j b ; b := b − 1)
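The trick of passing a procedure where an acceptor is expected carries over to any language with closures. In the Python sketch below, which is an illustration and not the Forsythe program, the array is a plain list, the slice is represented by the bounds a and b, and max_index is a name chosen here to avoid shadowing Python's built-in max.

```python
def max_index(X, a, b, j):
    """Apply the acceptor j to the subscript of a maximum of X over a..b."""
    local = a
    while a < b:
        a += 1
        if X(a) > X(local):
            local = a
    j(local)                      # 'j := local' is the call 'j local'


def exchange(xs, i, j):
    xs[i], xs[j] = xs[j], xs[i]


def maxsort(xs):
    a, b = 0, len(xs) - 1
    while a < b:
        # The acceptor performs the exchange directly; no variable j is needed.
        max_index(lambda k: xs[k], a, b, lambda j: exchange(xs, j, b))
        b -= 1


values = [5, 2, 9, 1, 5, 6]
maxsort(values)
```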




    letinline partition: intvararray → int → intacc → comm = λX. λr. λp.
        newintvar X.ll λc. newintvar X.ul λd. newintvar r λr.
        (while c ≤ d do
            if X c ≤ r then c := c + 1 else
            if X d > r then d := d − 1 else
            (exchange X c d ; c := c + 1 ; d := d − 1) ;
         p c)
    in
    letrec quicksort: intvararray → comm
    where quicksort = λX.
        newintvar X.ll λa. newintvar X.ul λb.
        if a < b then
            (if X a > X b then exchange X a b else skip ;
             partition (slice X (a + 1) (b − 1)) ((X a + X b) ÷ 2) λc.
                (quicksort (slice X a (c − 1)) ; quicksort (slice X c b)))
        else skip
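The partition and quicksort procedures can be exercised in Python as a check on the control structure. This sketch makes several assumptions: a slice is represented by a list together with explicit bounds, the comparisons are read as c ≤ d and X c ≤ r (the reading under which every element is classified before the loop ends), and ÷ is taken to be floor division.

```python
def partition(xs, l, u, r, p):
    """Rearrange xs[l..u] so that elements <= r precede elements > r, then call p(c)."""
    c, d = l, u
    while c <= d:
        if xs[c] <= r:
            c += 1
        elif xs[d] > r:
            d -= 1
        else:
            xs[c], xs[d] = xs[d], xs[c]
            c += 1
            d -= 1
    p(c)                          # c is the first subscript of the upper part


def quicksort(xs, a, b):
    """Sort xs[a..b] in place."""
    if a < b:
        if xs[a] > xs[b]:
            xs[a], xs[b] = xs[b], xs[a]
        def sort_parts(c):
            quicksort(xs, a, c - 1)   # always contains xs[a], never xs[b]
            quicksort(xs, c, b)       # always contains xs[b], never xs[a]
        pivot = (xs[a] + xs[b]) // 2
        partition(xs, a + 1, b - 1, pivot, sort_parts)


values = [3, -1, 4, 1, -5, 9, 2, 6, 5, 3, 5]
quicksort(values, 0, len(values) - 1)
```

Since the two end elements are excluded from the partitioned segment and each lands in a different part, both recursive calls operate on strictly shorter segments.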


As a further example of the power of declarators, we can define the type of triangular arrays of real variables, along with an appropriate declarator (which, to keep the example simple, initializes the array elements to zero):

    lettype trivararray = ((int → int → realvar) & size: int) in
    letinline newtrivararray: int →
            ((trivararray → comm) → comm & (trivararray → compl) → compl
             & (trivararray → int) → int & (trivararray → real) → real
             & (trivararray → bool) → bool & (trivararray → char) → char) =
        λn. λb. newintvar n λn. newrealvarseq((n × (n + 1)) ÷ 2)(λk. 0) λX.
        b(λi. λj.
            if 0 ≤ j ∧ j ≤ i ∧ i < n then X((i × (i + 1)) ÷ 2 + j)
            else error ‘subscript error’,
          size = n) .
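The subscript mapping used by the triangular array can be checked independently of Forsythe. The Python sketch below assumes the bounds test reads 0 ≤ j ≤ i < n, which is the reading under which the n(n + 1)/2 storage cells are covered exactly once; new_tri_array and its three returned functions are hypothetical names.

```python
def new_tri_array(n):
    """A lower-triangular array of n rows stored in one flat list of zeros."""
    store = [0.0] * (n * (n + 1) // 2)

    def index(i, j):
        if not (0 <= j <= i < n):
            raise IndexError("subscript error")
        return i * (i + 1) // 2 + j      # rows 0..i-1 occupy i*(i+1)/2 cells

    def get(i, j):
        return store[index(i, j)]

    def put(i, j, v):
        store[index(i, j)] = v

    return get, put, index


get, put, index = new_tri_array(4)
put(3, 2, 1.5)
cells = sorted(index(i, j) for i in range(4) for j in range(i + 1))

try:
    index(0, 1)                          # above the diagonal
    rejected = False
except IndexError:
    rejected = True
```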




12.  Input  and  Output 


The following program illustrates the facilities for input and output. It reads a sequence of pairs of nonnegative integers from a file named ‘infile’ and writes the real quotient of each pair in floating-point notation to a file named ‘outfile’. It is assumed that the integers in the input file are separated by sequences of one or more nondigits; if there is an odd number of integers, the last is simply converted from integer to real.

Each number in the output is printed on a separate line, as a six-digit number greater than or equal to one and less than ten, times a power of ten. A result of zero is printed as 0.0, while division by zero gives an error message.


    newinchannel ‘infile’ λic. newoutchannel ‘outfile’ λoc.
    letinline is_digit: char → bool = λc. newcharvar c λc. #0 ≤ c ∧ c ≤ #9,
        writecharseq: charseq → comm = λs. newintvar 0 λi.
            while i < s.len do (oc := s i ; i := i + 1)
    in
    letinline writereal: real → comm = λr.
            if r = 0 then writecharseq ‘0.0’ else
            real_to_charseq r 6 λs. λx.
            (oc := s 0 ; oc := #. ; writecharseq(λi. s(i + 1), len = 5) ;
             writecharseq ‘*10**’ ; int_to_charseq (x − 1) writecharseq) ,
        readint: compl → intacc → comm = λnonumber. λa.
            newcharvarseq 30 (λk. #0) λs. newintvar 0 λi.
            (repeat (ic(s i, eof = nonumber)) (is_digit(s i)) ;
             escape λe.
                repeat (i := i + 1 ; ic(s i, eof = e)) (~ is_digit(s i)) ;
             a := charseq_to_int(s, len = i))
    in
    escape λdone. loop
        readint done λm.
        readint (writereal m ; oc := #\n ; done) λn.
        if n = 0 then writecharseq ‘division by zero\n’ else
        (writereal(m/n) ; oc := #\n) .


Here repeat refers to the procedure defined in Section 9, while #0, #9, #., and #\n are character constants denoting the digits 0 and 9, the decimal point, and the new-line character.

The procedure readint uses a local character variable sequence to store the digit sequence being read, so that the number of digits is limited (to 29 in this example) by the length of the sequence. In fact, it is possible to avoid this limitation by programming readint recursively and using character sequences that are procedural functions rather than values of a variable sequence:

    letrec readint: compl → intacc → comm, readint1: charseq → intacc → comm
    where
        readint = λnonumber. λa.
            ic(λc. newcharvar c λc.
                  if is_digit c then readint1 (λi. c, len = 1) a else
                  readint nonumber a,
               eof = nonumber) ,
        readint1 = λs. λa. newintvar s.len λl. escape λe.
            ic(λc. newcharvar c λc.
                  if is_digit c then readint1 (λi. if i = l then c else s i, len = l + 1) a else
                  a := charseq_to_int s,
               eof = (a := charseq_to_int s ; e)) .


Here readint1 s a reads digits until encountering a nondigit, appends these digits on the right of the sequence s, converts the resulting sequence into an integer, and assigns the integer to a.

Unfortunately,  however,  this  version  of  readint  is  neither  perspicuous  nor  efficient  —  and 
is  not  recommended  as  good  programming  style. 


13.  Data  Abstraction  with  Objects 


Perhaps  the  most  important  way  in  which  Forsythe  is  more  general  than  Algol  is  in  its 
provision  of  objects,  which  are  a  powerful  tool  for  data  abstraction.  One  can  write  abstract 
programs  in  which  various  kinds  of  data  are  realized  by  types  of  objects,  and  then  encapsulate 
the  representation  of  the  data,  and  the  expression  of  primitive  operations  in  terms  of  this 
representation,  in  declarators  for  the  objects. 

To illustrate this style of programming, we will develop a program for computing reachability in a finite directed graph. Specifically, we will define a procedure reachable that, given a node x and a graph, will compute the set of nodes that can be reached from x.

Throughout most of this development we will assume that “node” is a new data type; eventually we will see how this assumption can be eliminated. Given node, we can define a “set” to be an object denoting a finite set of nodes, whose fields (called methods in the jargon of object-oriented programming) are procedures for manipulating the denoted set:

    lettype set =
        (member: node → bool
         & insertnew: node → comm
         & iter: (node → comm) → comm
         & pick: comm → (node → comm) → comm)

The intention is that, if s is a set, x is a node, d is a procedure of type node → comm, and e is a command, then:

• s.member x gives true if and only if x ∈ s.

• s.insertnew x inserts x into s, providing x is not already in s.

• s.iter d applies d to each member of s.

• If s is empty then s.pick e d executes e; otherwise s.pick e d removes an arbitrary member from s and applies d to the removed member.


In  terms  of  set,  we  can  give  a  naive  version  of  the  reachability  procedure.  The  procedure 
maintains  a  set  t  of  all  nodes  that  have  been  found  to  be  reachable  from  x,  and  a  set  u  of 
those  members  of  t  whose  immediate  successors  have  yet  to  be  added  to  t.  (An  immediate 
successor  of  a  node  y  is  a.  node  that  can  be  reached  from  y  in  one  step.)  Thus  its  invariant 
is 

xetAuCtA  (Vy  e  t)  y  is  reachable  from  x  A  (V?/  et  —  u)gyCt^ 

where  g  is  a  function  of  type  node  — >  set  such  that  gy  is  the  set  of  immediate  successors  of 
y.  This  invariant  implies  that,  when  u  is  empty,  t  is  the  set  of  all  nodes  reachable  from  x. 

In  writing  reachable^  we  assume  that  the  parameter  g  is  the  immediate-successor  function 
of  the  graph,  and  that  the  result  is  to  be  communicated  by  applying  a  procedural  parameter 
p  to  the  final  value  of  t: 

let reachable: node → (node → set) → (set → comm) → comm ≡
  λx. λg. λp. newset λt. newset λu.
    (t.insertnew x ; u.insertnew x ;
     escape λout. loop u.pick out λy. (g y).iter λz.
       if ¬ t.member z then (t.insertnew z ; u.insertnew z)
       else skip ;
     p t)
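
For readers who want to trace the algorithm, here is a hedged Python transcription of the same worklist scheme. It uses built-in Python sets instead of set objects, a while loop in place of escape, loop and pick, and integer nodes; all of these are illustrative assumptions rather than features of Forsythe.

```python
from typing import Callable, Dict, List, Set

Node = int  # assumption: nodes are integers


def reachable(x: Node,
              g: Callable[[Node], Set[Node]],
              p: Callable[[Set[Node]], None]) -> None:
    t: Set[Node] = set()  # nodes known to be reachable from x
    u: Set[Node] = set()  # members of t whose successors are still pending
    t.add(x)
    u.add(x)
    while u:              # stands in for escape / loop / pick
        y = u.pop()
        for z in g(y):
            if z not in t:
                t.add(z)
                u.add(z)
        assert u <= t     # the part of the invariant that says u is a subset of t
    p(t)                  # communicate the final value of t


edges: Dict[Node, Set[Node]] = {0: {1, 2}, 1: {2}, 2: {0}, 3: {0}}
result: List[Set[Node]] = []
reachable(0, lambda n: edges.get(n, set()), result.append)
```

Node 3 has an edge into node 0 but is not reachable from it, so it does not appear in the result.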

Here  newset  is  a  declarator  that  creates  an  object  of  type  set,  initialized  to  the  empty  set. 
Thus 

newset : (set → comm) → comm .

Actually, we could give newset the more general type

(set → comm) → comm & (set → compl) → compl ,

which would allow reachable to have the more general type

node → (node → set) → ((set → comm) → comm & (set → compl) → compl) .

This  generality,  however,  is  unnecessary  for  our  example  and  would  distract  from  our  argu¬ 
ment.  Thus,  in  this  section,  we  will  limit  our  declarators  to  the  case  where  their  calls  are 
commands. 


Next,  we  refine  the  reachability  procedure  to  provide  greater  flexibility  for  the  represen¬ 
tation  of  sets.  In  place  of  the  object  type  set,  we  introduce  different  object  types  for  the 
different  sets  used  in  the  program: 


•  setg  for  the  sets  produced  by  applying  g, 

•  sett  for  the  set  t, 

•  setu  for  the  set  u. 


The  basic  idea  is  to  limit  the  fields  of  each  of  these  object  types  to  those  procedures  that 
are  actually  needed  by  our  program.  However,  even  greater  flexibility  is  gained  by  taking 
advantage  of  the  fact  that  the  sets  t  and  u  are  declared  at  the  same  time,  and  that  u  is 
always  a  subset  of  t.  For  this  purpose,  we  introduce  a  “double  declarator”, 

newdoubleset : (sett → setu → comm) → comm

such that newdoubleset λt: sett. λu: setu. C executes C after binding both t and u to new
(initially empty) sets. Moreover, to enforce the invariant u ⊆ t, we will eliminate the operation
t.insertnew and redefine u.insertnew to insert its argument (which must not already belong
to t) into both u and t.


Thus  we  have 

lettype setg ≡ (iter: (node → comm) → comm),
        sett ≡ (member: node → bool & iter: (node → comm) → comm),
        setu ≡ (insertnew: node → comm
                & pick: comm → (node → comm) → comm)


in 

let reachable: node → (node → setg) → (sett → comm) → comm ≡
  λx. λg. λp. newdoubleset λt. λu.
    (u.insertnew x ;
     escape λout. loop u.pick out λy. (g y).iter λz.
       if ¬ t.member z then u.insertnew z else skip ;
     p t)

Notice that we have retained the iter field for objects of type sett, even though this procedure
is  never  used  in  our  program.  The  reason  is  that  the  result  of  reachable  is  an  object  of  type 
sett,  for  which  the  user  of  reachable  may  need  an  iteration  procedure. 

Now  we  define  the  representation  of  t  and  u  by  programming  newdoubleset .  Within  this 
declarator,  we  represent  t  by  a  characteristic  vector  c,  which  is  a  boolean  variable  array  that 
is indexed by nodes, i.e. a procedure of type node → boolvar, such that

t = { y | y: node ∧ c y = true } .

We also represent both t and u by a node variable sequence w that (with the help of
two integer variables a and b) enumerates the members of these sets without duplication.
Specifically,

t = { w k | 0 ≤ k < b } ,
u = { w k | a ≤ k < b } .


Thus  we  have 

letinline newdoubleset: (sett → setu → comm) → comm ≡ λp.
  newboolvarnodearray (λn. false) λc. newnodevarseq N (λk. dummynode) λw.
  newintvar 0 λa. newintvar 0 λb.
    p (member ≡ c,
       iter ≡ λd. for 0 (b - 1) λk. d(w k))
      (insertnew ≡ λn. (c n := true ; w b := n ; b := b + 1),
       pick ≡ λe. λd. if a ≥ b then e else (d(w a) ; a := a + 1))

Here N is an integer expression giving an upper bound on the number of nodes, and
dummynode is an arbitrary entity of type node used to give a spurious initialization to w.
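
The shared representation may be easier to see in a conventional language. The Python sketch below is an illustration only: the function name new_double_set, the dictionaries standing in for objects, and the fixed bound N = 8 are all assumptions, and nodes are taken to be the integers 0 to N - 1.

```python
from typing import Callable, Dict, List, Tuple

N = 8  # assumption: upper bound on the number of nodes


def new_double_set() -> Tuple[Dict[str, Callable], Dict[str, Callable]]:
    c: List[bool] = [False] * N   # characteristic vector of t
    w: List[int] = [-1] * N       # enumerates t; -1 plays the role of dummynode
    st = {"a": 0, "b": 0}         # t is w[0:b], u is w[a:b]

    def member(n: int) -> bool:
        return c[n]

    def iter_t(d: Callable[[int], None]) -> None:
        for k in range(st["b"]):
            d(w[k])

    def insertnew(n: int) -> None:
        # Inserts n into both u and t; n must not already belong to t.
        c[n] = True
        w[st["b"]] = n
        st["b"] += 1

    def pick(e: Callable[[], None], d: Callable[[int], None]) -> None:
        if st["a"] >= st["b"]:
            e()
        else:
            d(w[st["a"]])
            st["a"] += 1

    t = {"member": member, "iter": iter_t}
    u = {"insertnew": insertnew, "pick": pick}
    return t, u


t, u = new_double_set()
u["insertnew"](4)
u["insertnew"](1)
order: List[int] = []
u["pick"](lambda: None, order.append)
members: List[int] = []
t["iter"](members.append)
```

Picking a node advances a, which removes the node from u while leaving it in t; this is why a single sequence can serve both sets.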


Next,  we  consider  the  representation  of  graphs.  As  far  as  reachable  is  concerned,  a  graph  is 
simply its immediate-successor function, of type node → setg. But the part of the program
that creates graphs must have some primitive procedure for graph construction. Thus we
make graph an object type with a field named addedge, denoting a procedure that, given
its source and destination nodes, adds an edge to the graph:

lettype graph ≡ (node → setg & addedge: node → node → comm)

Notice  that  the  immediate-successor  function  is  a  “nameless”  field  of  a  graph,  so  that  a 
graph  can  be  passed  directly  to  reachable. 

We  choose  to  represent  a  graph  by  an  integer  variable  array  succlist,  indexed  by  nodes, 
such  that  succlist  n  is  a  list  of  the  immediate  successors  of  n.  The  lists  are  represented  by 
a  node  variable  sequence  car  and  an  integer  variable  sequence  cdr.  The  integer  variable  k 
gives  the  number  of  active  list  cells.  The  empty  list  is  represented  by  -1. 

Thus the declarator for graphs is:

letinline newgraph: (graph → comm) → comm ≡ λp.
  newintvarnodearray (λn. -1) λsucclist.
  newnodevarseq E (λk. dummynode) λcar.
  newintvarseq E (λk. -1) λcdr. newintvar 0 λk.
    p (λn. (iter ≡ λd. newintvar (succlist n) λl.
              while l ≠ -1 do (d(car l) ; l := cdr l)),
       addedge ≡ λm. λn.
         (car k := n ; cdr k := succlist m ; succlist m := k ; k := k + 1))

Here  E  is  an  upper  bound  on  the  number  of  edges  in  the  graph. 
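
As an informal illustration of this list representation, the Python sketch below keeps the names succlist, car, cdr and k from the text; the class name Graph, the method name successors, and the bounds N = 4 and E = 8 are assumptions introduced for the example.

```python
from typing import Callable, List

N = 4  # assumption: upper bound on the number of nodes
E = 8  # assumption: upper bound on the number of edges


class Graph:
    def __init__(self) -> None:
        self.succlist: List[int] = [-1] * N  # head cell of each node's list
        self.car: List[int] = [-1] * E       # node stored in each list cell
        self.cdr: List[int] = [-1] * E       # index of the next cell, -1 = empty
        self.k = 0                           # number of active list cells

    def addedge(self, m: int, n: int) -> None:
        # Push a new cell holding n onto the front of m's successor list.
        self.car[self.k] = n
        self.cdr[self.k] = self.succlist[m]
        self.succlist[m] = self.k
        self.k += 1

    def successors(self, n: int, d: Callable[[int], None]) -> None:
        # Corresponds to (g n).iter d: walk the list until the empty marker.
        cell = self.succlist[n]
        while cell != -1:
            d(self.car[cell])
            cell = self.cdr[cell]


g = Graph()
g.addedge(0, 1)
g.addedge(0, 2)
g.addedge(2, 3)
out: List[int] = []
g.successors(0, out.append)
```

Because each edge is pushed onto the front of its list, successors are visited in the reverse of the order in which they were added.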

Next,  we  consider  extending  our  program  so  that,  in  addition  to  determining  the  set  of 
nodes  that  can  be  reached  from  x,  it  computes  paths  from  x  to  each  of  these  nodes.  We 
will  alter  reachable  so  that  it  gives  its  parameter  p  an  additional  argument  r  of  type  paths, 
where  an  object  of  type  paths  provides  two  procedures  for  iterating  over  paths  in  forward 
and  backward  directions: 

lettype paths ≡ (forward, backward: node → (node → comm) → comm)

If r is an object of type paths, y is a node reachable from x, and d is a procedure of type
node → comm, then r.forward y d or r.backward y d will apply d to each node on the
path from x to y.

Within reachable, each time an immediate successor z of y is inserted in t and u, the path
to z formed by adding z to the already known path to y will be recorded in r. Thus, within
reachable, r will have the type

lettype pathsvar ≡ (paths & record: node → node → comm)

where r.record is a procedure such that r.record z y records the path to z formed by adding
z to the path to y. (In choosing the name pathsvar we are stretching the meaning of “var”.
Although  an  object  of  type  pathsvar  cannot  be  assigned  to,  in  the  conventional  sense,  it 
still  consists  of  an  object  of  type  paths  intersected  with  an  operation  that  changes  the  state 
of  the  object.) 

The  new  version  of  reachable  is: 

let reachable: node → (node → setg) → (sett → paths → comm) → comm ≡
  λx. λg. λp. newdoubleset λt. λu. newpathsvar x λr.
    (u.insertnew x ;
     escape λout. loop u.pick out λy. (g y).iter λz.
       if ¬ t.member z then u.insertnew z ; r.record z y else skip ;
     p t r)

The representation of paths is defined within the declarator newpathsvar. A node variable
array  link,  indexed  by  nodes,  is  used  to  record  the  calls  of  record,  so  that  link  z  =  y  holds 
after  a  call  of  r.  record  z  y.  Then  forward  scans  link  recursively,  while  backward  scans  link 
iteratively: 

letinline newpathsvar: node → (pathsvar → comm) → comm ≡ λx. λp.
  newnodevarnodearray (λn. dummynode) λlink.
    p (record ≡ λz. λy. link z := y,
       forward ≡ λn. λd.
         letrec scan: node → comm
         where scan ≡ λn. newnodevar n λn.
           if eqnode n x then d x else (scan(link n) ; d n)
         in scan n,
       backward ≡ λn. λd. newnodevar n λn.
         (while ¬ eqnode n x do (d n ; n := link n) ; d x))

Here  eqnode  is  a  primitive  operation  for  comparing  nodes.  The  initial  parameter  x  of  new¬ 
pathsvar  is  the  node  from  which  all  the  paths  emanate. 
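
The two traversal styles can be sketched in Python as follows. This is an illustration under assumptions: the function name new_paths_var, the explicit n_nodes parameter, and integer nodes are not taken from the report, and -1 stands in for dummynode.

```python
from typing import Callable, List


def new_paths_var(x: int, n_nodes: int):
    link: List[int] = [-1] * n_nodes  # link[z] == y after record(z, y)

    def record(z: int, y: int) -> None:
        link[z] = y

    def forward(n: int, d: Callable[[int], None]) -> None:
        # Recursive scan: visit the path to the predecessor first, then n.
        def scan(m: int) -> None:
            if m == x:
                d(x)
            else:
                scan(link[m])
                d(m)
        scan(n)

    def backward(n: int, d: Callable[[int], None]) -> None:
        # Iterative scan: follow links from n back to x.
        while n != x:
            d(n)
            n = link[n]
        d(x)

    return record, forward, backward


record, forward, backward = new_paths_var(0, 4)
record(1, 0)   # path to 1 extends the path to 0
record(2, 1)   # path to 2 extends the path to 1
fwd: List[int] = []
bwd: List[int] = []
forward(2, fwd.append)
backward(2, bwd.append)
```

Only backward links are stored, so the forward traversal needs recursion (or an explicit stack) to reverse them.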

Finally,  we  must  define  the  data  type  node.  Forsythe  lacks  facilities  for  defining  new 
data  types,  but  the  effect  of  a  new  data  type  can  be  obtained  by  defining  the  relevant  phrase 
types,  primitive  operations,  and  declarators.  This  is  easy  if  we  use  a  trivial  representation, 
where a node is represented by an integer n such that 0 ≤ n < N:

lettype node ≡ int,
        nodeacc ≡ intacc,
        nodevar ≡ intvar
in
letinline dummynode ≡ -1,
  eqnode ≡ λm: node. λn: node. m = n,
  newnodevar ≡ newintvar,
  newnodevarseq ≡ newintvarseq,
  newboolvarnodearray ≡ newboolvarseq N,
  newintvarnodearray ≡ newintvarseq N,
  newnodevarnodearray ≡ newintvarseq N

Unfortunately,  this  way  of  defining  node  is  limited  by  the  fact  that  lettype  definitions  are 
transparent  rather  than  opaque.  Thus  typechecking  would  not  detect  an  erroneous  operation 
that  treated  nodes  as  integers. 

To avoid this difficulty, one would like to have opaque type definitions in Forsythe.
However, even in the absence of opaque definitions, one can still achieve a degree of data
abstraction by defining nodes to be one-field objects containing integers, rather than “raw”
integers. This approach assures that the integrity of the abstraction node will not be violated
by the reachability program providing this program does not contain any occurrence of the
field name used in the definition of node.

This  approach  is  embodied  in  the  following  definitions,  in  which  nn  is  used  as  the  “secret” 
field  name: 

lettype node ≡ (nn: int) in
lettype nodeacc ≡ node → comm in
lettype nodevar ≡ (node & nodeacc) in
lettype nodevarseq ≡ (int → nodevar & len: int),
        boolvarnodearray ≡ node → boolvar,
        intvarnodearray ≡ node → intvar,
        nodevarnodearray ≡ node → nodevar
in
letinline dummynode ≡ (nn ≡ -1),
  eqnode: node → node → bool ≡ λm. λn. m.nn = n.nn,
  newnodevar: node → (nodevar → comm) → comm ≡
    λinit. λb. newintvar (init.nn) λx. b(nn ≡ x, λm. x := m.nn),
  newnodevarseq: int → (int → node) → (nodevarseq → comm) → comm ≡
    λl. λinit. λb. newintvarseq l (λk. (init k).nn) λx.
      b(λk. (nn ≡ x k, λm. x k := m.nn), len ≡ x.len)
in
letinline newboolvarnodearray: (node → bool) →
      (boolvarnodearray → comm) → comm ≡
    λinit. λb. newboolvarseq N (λk. init(nn ≡ k)) λx. b(λn. x(n.nn)),
  newintvarnodearray: (node → int) →
      (intvarnodearray → comm) → comm ≡
    λinit. λb. newintvarseq N (λk. init(nn ≡ k)) λx. b(λn. x(n.nn)),
  newnodevarnodearray: (node → node) →
      (nodevarnodearray → comm) → comm ≡
    λinit. λb. newnodevarseq N (λk. init(nn ≡ k)) λx. b(λn. x(n.nn))

This approach to defining a new data type by defining the relevant phrase types can be
used  to  provide  more  interesting  representations.  As  a  final  example,  we  define  complex 
numbers: 


lettype complex ≡ (r: real & i: real) in
lettype complexacc ≡ complex → comm in
lettype complexvar ≡ (complex & complexacc) in

letinline newcomplexvar: complex → (complexvar → comm) → comm ≡
  λinit. λbody. newrealvar init.r λrpart. newrealvar init.i λipart.
    body(r ≡ rpart, i ≡ ipart,
         λc. newrealvar c.r λt. (ipart := c.i ; rpart := t)) .


Here  newcomplexvar  is  a  declarator  that  represents  a  complex  variable  by  two  real  variables 
rpart  and  ipart.  The  last  line  defines  an  acceptor  that  sets  the  representation  variables. 
(Note  the  use  of  the  temporary  t  to  insure  that  both  parts  of  the  complex  argument  c  are 
evaluated  before  either  representation  variable  is  set.) 


Of  course,  one  would  like  to  make  real  numbers  a  subtype  of  complex  numbers,  with 
an  implicit  conversion  that  sets  ipart  to  zero.  In  Forsythe,  however,  this  is  not  permitted. 
Unfortunately,  there  seems  to  be  no  way  to  allow  such  user-defined  implicit  conversions  while 
enforcing  the  relationships  between  conversions  and  procedures  with  multiple  types  that  are 
described  in  Section  4. 


14.  Other  Publications  Related  to  Forsythe 

The  genesis  of  Forsythe  lies  in  the  author’s  general  viewpoint  about  Algol-like  languages  [5], 
and  especially  in  the  functor-category  semantics  of  such  languages  developed  by  F.  Oles  [16, 
17].  The  language  was  first  described  in  a  preliminary  report  [1],  of  which  this  document  is 
a  substantial  revision.  (The  most  important  change  in  the  language  is  that  the  requirements 
for  explicit  type  information  have  been  made  more  flexible.) 

There  are  several  technical  problems  associated  with  intersection  types.  The  semantics  of 
intersection as a pullback is not syntax-directed, since the meaning of θ₁ & θ₂ depends upon
the meaning of θ₁ ⊔ θ₂ as well as of θ₁ and of θ₂. A demonstration that the semantics is still
well-defined  was  given  in  an  invited  talk  at  the  Logic  in  Computer  Science  Symposium  [19], 
but  was  never  written  up.  A  proof  that  the  semantics  is  coherent,  i.e.  that  different  proofs 
of  the  same  typing  do  not  lead  to  different  meanings,  was  given  in  [20]. 

The  functor-category  semantics  is  the  basis  of  a  scheme  for  generating  intermediate  code 
for Algol-like languages [21].


15. Conclusions and Future Research


There  are  a  number  of  directions  in  which  it  would  be  desirable  to  extend  Forsythe,  providing 
such  extensions  do  not  impact  the  uniformity  of  the  language.  We  are  currently  investigating, 
or  hope  to  investigate,  the  following  possibilities: 

•  Sums  or  disjunctions  of  phrase  types.  Unfortunately,  sums  of  phrase  types  interact 
with  the  conditional  construction  in  a  counterintuitive  manner.  Suppose,  for  example, 
that comm + int is the binary sum of comm and int, with injection operations in₁
of type comm → (comm + int) and in₂ of type int → (comm + int), and a case
operation ⊕ such that, for any phrase type θ, if p₁ has type comm → θ and p₂ has
type int → θ then p₁ ⊕ p₂ has type (comm + int) → θ, with the reduction rule

(p₁ ⊕ p₂)(inᵢ x) ⇒ pᵢ x.

If application of the conditional construction to sum types is to make sense, then the
reduction

(p₁ ⊕ p₂)(if b then in₁ c else in₂ e) ⇒ if b then p₁ c else p₂ e

must hold. Then, if the language is to exhibit reasonably uniform behavior, the similar
reduction

(p₁ ⊕ p₂)(if b then in₁ c else in₁ c′) ⇒ if b then p₁ c else p₁ c′

must also hold. But then, the reduction

in₁ (if b then c else c′) ⇒ if b then in₁ c else in₁ c′

cannot hold, for otherwise we would have both

(p₁ ⊕ p₂)(in₁ (if b then c else c′)) ⇒ p₁ (if b then c else c′)

and

(p₁ ⊕ p₂)(in₁ (if b then c else c′)) ⇒ (p₁ ⊕ p₂)(if b then in₁ c else in₁ c′)
                                      ⇒ if b then p₁ c else p₁ c′ ,

which  reduce  the  same  phrase  to  two  phrases  that  will  have  different  meanings  if  the 
procedure  pi  changes  the  value  of  b  before  executing  its  parameter.  In  particular, 
the  falsity  of  the  reduction  rule  for  injections  and  conditionals  implies  that  injections 
cannot  be  treated  as  implicit  conversions. 

• Polymorphic or universally quantified phrase types [22], possibly with bounded
quantification in the sense of [23]. In addition to providing polymorphic procedures, this
extension would also provide opaque type definitions. This kind of extension has been
investigated by B. C. Pierce [18, 24], who found that there is no decision procedure for
type-checking  bounded  quantification,  even  in  the  absence  of  intersection  types.  On 
the  other  hand,  Pierce  gave  a  practical  algorithm  for  type-checking  the  combination 
of  bounded  quantification  and  intersection  types  that  seems  to  terminate  in  cases  of 
practical  interest.  He  also  gave  a  number  of  interesting  examples  of  the  descriptive 
power  of  such  a  type  system. 

•  Recursively  defined  phrase  types. 

•  Enriched  data  types.  Although  the  data  types  of  Algol  (and  so  far  of  Forsythe)  are 
limited  to  primitive,  unstructured  types,  there  would  be  no  inconsistency  in  providing 
a  much  richer  variety  of  data  types.  The  real  question  is  which  of  the  many  possible 
enrichments  would  provide  additional  expressive  power  without  degrading  efficiency  of 
execution. 

•  Coroutines. 

•  Alternative  treatment  of  arrays.  Array  facilities  along  the  lines  of  those  described  in 
[25]  would  serve  to  avoid  spurious  array  initializations,  such  as  the  initialization  of  w 
in  newdoubleset  in  Section  13.  But  it  is  not  clear  how  this  approach  can  be  extended 
to  encompass  multidimensional  arrays. 

On  the  other  hand,  there  is  also  a  direction  in  which  it  might  be  fruitful  to  restrict 
Forsythe:  to  impose  syntactic  restrictions  so  that  one  can  determine  syntactically  (in  a  fail¬ 
safe  manner)  when  phrases  cannot  interfere  with  one  another.  (Two  phrases  interfere  if 
their  concurrent  execution  is  indeterminate.  For  example,  aliased  variables  interfere,  as  do 
procedures  that  assign  to  the  same  global  variables.)  Such  a  restriction  would  open  the  door 
for  the  concurrent,  yet  determinate,  execution  of  noninterfering  commands,  as  well  as  for  a 
form  of  block  expression  (in  the  sense  of  Algol  W)  that  is  restricted  to  avoid  side  effects. 

Nearly  two  decades  ago,  I  wrote  a  paper  [26]  proposing  a  scheme  for  restricting  Algol-like 
languages  for  this  purpose.  At  the  time,  certain  syntactic  anomalies  (described  in  the  final 
section  of  [26])  discouraged  me  from  pursuing  the  matter  further.  But  it  is  now  clear  that 
these  anomalies  can  be  avoided  [27,  28].  Moreover,  it  appears  that  this  approach  does  not 
raise  insuperable  type-checking  complications. 

However,  the  syntactic  discipline  described  in  these  papers  would  still  restrict  Forsythe 
uncomfortably.  For  example,  one  could  not  regard  a  :=  e  as  a  e  when  a  and  e  interfere,  nor 
could  one  write  newintvar  init  b  when  init  and  b  interfere.  There  are  also  problems  with 
recursive procedures that assign to global variables, and the use of completions is precluded.


APPENDICES


A  Lexical  Structure 

A  Forsythe  program,  in  the  form  actually  read  by  a  computer,  is  an  ASCII  character  string 
that  is  interpreted  as  a  sequence  of  lexemes  and  separators.  (After  eliminating  the  separators, 
the  sequence  of  lexemes  is  interpreted  according  to  the  concrete  syntax  given  in  Appendix  B.) 

A  lexeme  is  one  of  the  following: 

•  A  keyword,  which  is  one  of  the  following  character  strings: 


and        begin      div        do
else       end        error      if
iff        implies    in         let
letinline  letrec     lettype    loop
or         rec        rem        seq
then       where      while

•  An  identifier,  i.e.  (id),  which  is  a  sequence  of  one  or  more  letters,  digits,  and  underscore 
symbols  that  begins  with  a  letter  and  is  not  a  keyword. 

In  this  report,  keywords,  and  also  identifiers  that  are  used  as  type  identifiers,  are 
typeset  in  boldface,  but  no  such  font  distinction  is  made  in  the  language  actually  read 
by  the  computer. 

•  A  natural-number  constant,  i.e.  (nat  const),  which  is  a  sequence  of  one  or  more  digits. 

•  A  real-number  constant,  i.e.  (real  const),  which  is  a  sequence  of  digits  and  decimal 
points beginning with a digit and containing exactly one decimal point; this sequence
may  optionally  be  followed  by  a  scale  factor,  which  consists  of  the  letter  E  or  e,  an 
optional  +  or  -  sign,  and  a  natural-number  constant. 

•  A  character  constant,  i.e.  (char  const),  which  is  the  character  #,  followed  by  a  character 
item,  which  is  one  of  the  following: 

-  a  character  other  than  the  backslash  \,  the  newline  symbol,  or  the  tab  symbol, 

-  \\,  denoting  the  backslash, 

-  \n,  denoting  a  newline  symbol, 

-  \t,  denoting  a  tab  symbol, 

- \', denoting the quotation mark ',

-  a  backslash,  followed  by  three  digits  (which  are  interpreted  as  an  ASCII  code  in 
octal  representation). 

• A string constant, i.e. (string const), which is a sequence of zero or more string items,
enclosed in the single quotation marks ' and ', where a string item is one of the
following:


—  a  character  other  than  the  backslash  \,  the  newline  symbol,  the  tab  symbol,  or 
the  quotation  mark  ’ , 

—  \\,  denoting  the  backslash, 

—  \n,  denoting  a  newline  symbol, 

—  \t,  denoting  a  tab  symbol, 

— \', denoting the quotation mark ’ ,

—  a  backslash,  followed  by  three  digits  (which  are  interpreted  as  an  ASCII  code  in 
octal  representation), 

—  the  backslash,  followed  by  a  blank,  tab,  or  newline,  followed  by  a  sequence  of  zero 
or  more  characters  other  than  a  backslash,  followed  by  a  backslash. 

The  last  form  of  string  item  has  no  effect  on  the  meaning  of  the  string  constant.  It  is 
included  to  allow  such  constants  to  run  over  more  than  one  line  of  a  Forsythe  program. 

•  A  special  symbol,  which  is  one  of  the  following  characters  or  strings: 

( ) : &
+ - / =
< > <= >=
:= ; ->

A separator is either a blank, a tab, a newline, or a comment, where a comment is either:

• a sequence of characters enclosed in the braces { and }. If the braces { and } occur
within the sequence of characters, they must be balanced.

• a percent sign % followed by a sequence of characters not containing a newline, followed
by a newline.

Except that they separate lexemes that would otherwise combine into a single lexeme
(when, for example, an identifier is followed by another identifier or a natural-number
constant), separators have no effect on the meaning or translation of a program. More precisely,
the program is interpreted by scanning the input characters from left to right, repeatedly
removing the longest string that is a lexeme or separator, and then eliminating the separators
from the resulting sequence.
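
The longest-match rule described above can be sketched as a small scanner. The Python code below covers only a subset of the lexical structure (no character or string constants, no brace comments, a handful of keywords and special symbols); the names KEYWORDS, PATTERNS and scan are assumptions of this sketch, not part of any Forsythe implementation.

```python
import re
from typing import List, Optional, Tuple

KEYWORDS = {"if", "then", "else", "while", "do", "let", "in"}  # subset only

# Every pattern is tried at the current position; the longest match wins.
PATTERNS = [
    ("real", re.compile(r"\d+\.\d*(?:[Ee][+-]?\d+)?")),
    ("nat", re.compile(r"\d+")),
    ("id", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
    ("sym", re.compile(r":=|<=|>=|->|[():&+\-=<>;/]")),
    ("sep", re.compile(r"[ \t\n]|%[^\n]*\n")),
]


def scan(text: str) -> List[Tuple[str, str]]:
    pos = 0
    out: List[Tuple[str, str]] = []
    while pos < len(text):
        best_kind: Optional[str] = None
        best = ""
        for kind, pattern in PATTERNS:
            m = pattern.match(text, pos)
            if m and len(m.group()) > len(best):
                best_kind, best = kind, m.group()
        if best_kind is None:
            raise ValueError(f"no lexeme or separator at position {pos}")
        pos += len(best)
        if best_kind == "sep":
            continue                  # separators are eliminated
        if best_kind == "id" and best in KEYWORDS:
            best_kind = "keyword"     # identifiers must not be keywords
        out.append((best_kind, best))
    return out


lexemes = scan("x1:=3.5e2 % comment\nifx <= 10")
```

Note that ifx is scanned as a single identifier rather than the keyword if followed by x, because the longest string is always removed first.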

A number of the symbols used in this report (including the concrete syntax given in
Appendix  B)  are  not  available  in  ASCII.  They  must  be  translated  into  lexemes  as  follows: 


publication   ascii        publication   ascii

→             ->           ≤             <=
≡             ==           ≥             >=
λ             \            ¬             ~
↑             ^            ∧             and
×             *            ∨             or
÷             div          ⇒             implies
≠             ~=           ⇔             iff


B Concrete Syntax


We  will  specify  the  concrete  syntax  of  Forsythe  by  giving  a  context-free  grammar  that  defines 
“parsable”  programs  as  sequences  of  lexemes.  (The  conversion  of  a  string  of  input  characters 
into  a  sequence  of  lexemes,  and  also  the  transliteration  of  lexemes  needed  to  fit  the  constraint 
of  the  ASCII  character  set,  are  defined  in  Appendix  A.)  Notice  that  a  parsable  program  is 
not  necessarily  typable;  typing  is  specified  by  the  inference  rules  given  previously  and  is 
independent  of  the  concrete  syntax. 

The  one  novelty  of  this  syntax  is  its  treatment  of  “heavy  prefixes”  such  as  the  conditional 
phrase.  Such  phrases  are  permitted  to  follow  operators  even  when  those  operators  have  high 
precedence.  For  example  one  can  write 

A × if B then C else D + E

instead of

A × (if B then C else D + E) .

To  illustrate  the  treatment  of  heavy  prefixes,  consider  augmenting  a  simple  language  of 
arithmetic  expressions, 

⟨factor⟩ ::= ⟨id⟩ | ( ⟨expression⟩ )

⟨term⟩ ::= ⟨factor⟩ | ⟨term⟩ × ⟨factor⟩

⟨expression⟩ ::= ⟨term⟩ | ⟨term⟩ + ⟨expression⟩

with a conditional expression, treated as a heavy prefix. The resulting grammar is:

⟨factor⟩ ::= ⟨id⟩ | ( ⟨general expression⟩ )

⟨heavy factor⟩ ::= if ⟨general expression⟩ then ⟨general expression⟩ else
                   ⟨general expression⟩

⟨term⟩ ::= ⟨factor⟩ | ⟨term⟩ × ⟨factor⟩

⟨heavy term⟩ ::= ⟨heavy factor⟩ | ⟨term⟩ × ⟨heavy factor⟩

⟨expression⟩ ::= ⟨term⟩ | ⟨term⟩ + ⟨expression⟩

⟨heavy expression⟩ ::= ⟨heavy term⟩ | ⟨term⟩ + ⟨heavy expression⟩

⟨general expression⟩ ::= ⟨expression⟩ | ⟨heavy expression⟩

In  this  simple  example,  a  phrase  beginning  with  a  heavy  prefix  will  extend  to  the  next 
right  parenthesis  (or  to  the  end  of  the  text).  In  the  actual  syntax  of  Forsythe,  such  phrases 
extend  to  the  next  semicolon,  comma,  right  parenthesis,  or  end. 
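
To see how the heavy-prefix grammar behaves in practice, here is a sketch of a recursive-descent parser for the simple example language (not for Forsythe itself). It assumes tokens are already separated, writes × as *, and returns nested tuples; the function names and the tuple encoding are choices made for this illustration.

```python
from typing import List, Optional, Tuple, Union

Tree = Union[str, Tuple]
RESERVED = {"if", "then", "else", "(", ")", "*", "+"}


def parse(tokens: List[str]) -> Tree:
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def take(expected: Optional[str] = None) -> str:
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def heavy_factor() -> Tree:
        take("if")
        cond = general()
        take("then")
        yes = general()
        take("else")
        no = general()  # extends as far to the right as possible
        return ("if", cond, yes, no)

    def factor() -> Tree:
        tok = take()
        if tok == "(":
            inner = general()
            take(")")
            return inner
        if tok in RESERVED:
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    def term() -> Tree:
        # Covers both (term) and (heavy term): a heavy factor may only come last.
        if peek() == "if":
            return heavy_factor()
        left = factor()
        while peek() == "*":
            take("*")
            if peek() == "if":
                return ("*", left, heavy_factor())
            left = ("*", left, factor())
        return left

    def general() -> Tree:
        left = term()
        if peek() == "+":
            take("+")
            return ("+", left, general())
        return left

    tree = general()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree


tree1 = parse("A * if B then C else D + E".split())
tree2 = parse("( A * if B then C else D ) + E".split())
```

In the first parse the conditional swallows D + E even though it follows the higher-precedence operator; in the second, the right parenthesis ends the heavy phrase.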

The grammar of Forsythe is given by the following productions, in which we use ⟨type n⟩
to denote type expressions, ⟨p n⟩ to denote phrases, and ⟨hp n⟩ to denote heavy phrases.
The integer n indicates the precedence level (with small n for high precedence). The symbols
⟨id⟩, ⟨nat const⟩, ⟨real const⟩, ⟨char const⟩, and ⟨string const⟩ denote the lexeme classes for
identifiers and various kinds of constants, as defined in Appendix A.

⟨program⟩ ::= ⟨p 16⟩

⟨type id⟩ ::= ⟨id⟩

⟨id list⟩ ::= ⟨id⟩ | ⟨id⟩ , ⟨id list⟩

⟨type 0⟩ ::= ⟨type id⟩ | ( ⟨type 3⟩ ) | begin ⟨type 3⟩ end
⟨type 1⟩ ::= ⟨type 0⟩ | ⟨type 0⟩ → ⟨type 1⟩
⟨type 2⟩ ::= ⟨type 1⟩ | ⟨id list⟩ : ⟨type 2⟩
⟨type 3⟩ ::= ⟨type 2⟩ | ⟨type 2⟩ & ⟨type 3⟩

⟨alt type⟩ ::= ⟨type 1⟩ | ⟨type 1⟩ "|" ⟨alt type⟩

⟨let list⟩ ::= ⟨id⟩ ≡ ⟨p 15⟩ | ⟨id⟩ : ⟨type 1⟩ ≡ ⟨p 15⟩
             | ⟨id⟩ ≡ ⟨p 15⟩ , ⟨let list⟩ | ⟨id⟩ : ⟨type 1⟩ ≡ ⟨p 15⟩ , ⟨let list⟩

⟨letrec list⟩ ::= ⟨id list⟩ : ⟨type 1⟩ | ⟨id list⟩ : ⟨type 1⟩ , ⟨letrec list⟩

⟨where list⟩ ::= ⟨id⟩ ≡ ⟨p 15⟩ | ⟨id⟩ ≡ ⟨p 15⟩ , ⟨where list⟩

⟨lettype list⟩ ::= ⟨type id⟩ ≡ ⟨alt type⟩ | ⟨type id⟩ ≡ ⟨alt type⟩ , ⟨lettype list⟩

⟨seq list⟩ ::= ⟨p 15⟩ | ⟨p 15⟩ , ⟨seq list⟩

⟨p 0⟩ ::= ⟨id⟩ | ⟨nat const⟩ | ⟨real const⟩ | ⟨char const⟩ | ⟨string const⟩
        | ( ⟨p 16⟩ ) | begin ⟨p 16⟩ end
⟨hp 0⟩ ::= if ⟨p 16⟩ then ⟨p 16⟩ else ⟨p 13⟩
         | while ⟨p 16⟩ do ⟨p 13⟩
         | loop ⟨p 13⟩
         | λ ⟨id list⟩ : ⟨alt type⟩ . ⟨p 13⟩
         | λ ⟨id list⟩ . ⟨p 13⟩
         | rec : ⟨type 1⟩ . ⟨p 13⟩
         | let ⟨let list⟩ in ⟨p 13⟩ | letinline ⟨let list⟩ in ⟨p 13⟩
         | letrec ⟨letrec list⟩ where ⟨where list⟩ in ⟨p 13⟩
         | lettype ⟨lettype list⟩ in ⟨p 13⟩

⟨p 1⟩ ::= ⟨p 0⟩ | ⟨p 1⟩ . ⟨id⟩
⟨hp 1⟩ ::= ⟨hp 0⟩

⟨p 2⟩ ::= ⟨p 1⟩ | ⟨p 2⟩ ⟨p 1⟩ | error ⟨p 1⟩
        | seq ( ⟨seq list⟩ ) | seq begin ⟨seq list⟩ end
⟨hp 2⟩ ::= ⟨hp 1⟩ | ⟨p 2⟩ ⟨hp 1⟩ | error ⟨hp 1⟩

⟨p 3⟩ ::= ⟨p 2⟩ | ⟨p 3⟩ ⟨exp op⟩ ⟨p 2⟩
⟨hp 3⟩ ::= ⟨hp 2⟩ | ⟨p 3⟩ ⟨exp op⟩ ⟨hp 2⟩
⟨exp op⟩ ::= ↑ | **

⟨p 4⟩ ::= ⟨p 3⟩ | ⟨p 4⟩ ⟨mult op⟩ ⟨p 3⟩
⟨hp 4⟩ ::= ⟨hp 3⟩ | ⟨p 4⟩ ⟨mult op⟩ ⟨hp 3⟩
⟨mult op⟩ ::= × | / | ÷ | rem

⟨p 5⟩ ::= ⟨p 4⟩ | ⟨add op⟩ ⟨p 4⟩ | ⟨p 5⟩ ⟨add op⟩ ⟨p 4⟩
⟨hp 5⟩ ::= ⟨hp 4⟩ | ⟨add op⟩ ⟨hp 4⟩ | ⟨p 5⟩ ⟨add op⟩ ⟨hp 4⟩
⟨add op⟩ ::= + | −

⟨p 6⟩ ::= ⟨p 5⟩ | ⟨p 6⟩ ⟨rel op⟩ ⟨p 5⟩
⟨hp 6⟩ ::= ⟨hp 5⟩ | ⟨p 6⟩ ⟨rel op⟩ ⟨hp 5⟩
⟨rel op⟩ ::= = | ≠ | < | ≤ | > | ≥

⟨p 7⟩ ::= ⟨p 6⟩ | ¬ ⟨p 6⟩
⟨hp 7⟩ ::= ⟨hp 6⟩ | ¬ ⟨hp 6⟩

⟨p 8⟩ ::= ⟨p 7⟩ | ⟨p 8⟩ ∧ ⟨p 7⟩
⟨hp 8⟩ ::= ⟨hp 7⟩ | ⟨p 8⟩ ∧ ⟨hp 7⟩

⟨p 9⟩ ::= ⟨p 8⟩ | ⟨p 9⟩ ∨ ⟨p 8⟩
⟨hp 9⟩ ::= ⟨hp 8⟩ | ⟨p 9⟩ ∨ ⟨hp 8⟩

⟨p 10⟩ ::= ⟨p 9⟩ | ⟨p 10⟩ ⇒ ⟨p 9⟩
⟨hp 10⟩ ::= ⟨hp 9⟩ | ⟨p 10⟩ ⇒ ⟨hp 9⟩

⟨p 11⟩ ::= ⟨p 10⟩ | ⟨p 11⟩ ⇔ ⟨p 10⟩
⟨hp 11⟩ ::= ⟨hp 10⟩ | ⟨p 11⟩ ⇔ ⟨hp 10⟩

⟨p 12⟩ ::= ⟨p 11⟩ | ⟨p 12⟩ := ⟨p 11⟩
⟨hp 12⟩ ::= ⟨hp 11⟩ | ⟨p 12⟩ := ⟨hp 11⟩

⟨p 13⟩ ::= ⟨p 12⟩ | ⟨hp 12⟩

⟨p 14⟩ ::= ⟨p 13⟩ | ⟨p 14⟩ ; ⟨p 13⟩

⟨p 15⟩ ::= ⟨p 14⟩ | ⟨id⟩ ≡ ⟨p 15⟩

⟨p 16⟩ ::= ⟨p 15⟩ | ⟨p 16⟩ , ⟨id⟩ ≡ ⟨p 15⟩
         | ⟨p 16⟩ , λ ⟨id list⟩ : ⟨alt type⟩ . ⟨p 13⟩
         | ⟨p 16⟩ , λ ⟨id list⟩ . ⟨p 13⟩


C Type Checking


It  is  well-known  that  there  is  no  algorithm  for  the  complete  inference  of  intersection  types 
[13,  14,  15].  Thus,  Forsythe  must  require  the  programmer  to  provide  a  degree  of  explicit 
type  information.  We  have  attempted  to  make  this  requirement  as  flexible  as  possible;  as  a 
consequence,  a  precise  description  of  where  explicit  types  must  occur  is  rather  complicated. 

The  following  is  a  grammatical  schema  (i.e.  a  van  Wijngaarden  grammar)  for  the  abstract 
syntax  of  a  sublanguage  of  the  language  described  earlier,  such  that  every  program  in  the 
sublanguage  contains  enough  type  information  to  be  typechecked  (even  though  the  program 
may  not  be  type-correct).  The  converse  is  nearly  true,  though  (using  phrases  of  type  ns) 
one  can  contrive  programs  that  typecheck  even  though  they  do  not  satisfy  this  schema. 

The nonterminal symbols ⟨p_n⟩ and ⟨seq list_n⟩ are indexed by nonnegative integers and
infinity, with ∞ ± 1 = ∞ and 0 − 1 = 0. It is assumed that syntactic sugar (excepting
definitional forms) has been eliminated as in Section 7, and that ⟨type⟩, ⟨alt type⟩, ⟨lettype list⟩,
and ⟨letrec list⟩ are defined as in Appendix B.

(program) ::= (p_0)

(unary op) ::= + | − | ~

(binary op) ::= ↑ | ** | × | / | ÷ | rem | + | − | = | ≠ | ≤ | < | ≥ | > | ∧ | ∨ | ⇒ | ⇔ | ;

(let list) ::= (id) ≡ (p_∞) | (id) : (type) ≡ (p_0)
    | (id) ≡ (p_∞), (let list) | (id) : (type) ≡ (p_0), (let list)

(where list) ::= (id) ≡ (p_0) | (id) ≡ (p_0), (where list)

(seq list_n) ::= (p_{n-1}) | (p_{n-1}), (seq list_n)

(p_n) ::= (id) | (nat const) | (real const) | (char const) | (string const)
    | if (p_0) then (p_n) else (p_n)
    | while (p_0) do (p_0) | loop (p_0)
    | λ(id) : (alt type). (p_{n-1}) | (p_n), λ(id) : (alt type). (p_{n-1})
    | rec : (type). (p_0)
    | let (let list) in (p_n) | letinline (let list) in (p_n)
    | letrec (letrec list) where (where list) in (p_n)
    | lettype (lettype list) in (p_n)
    | (p_n).(id)
    | (p_{n+1}) (p_0)
    | seq((seq list_n))
    | (unary op) (p_0) | (p_0) (binary op) (p_0)
    | (id) ≡ (p_n) | (p_n), (id) ≡ (p_n)

(p_0) ::= λ(id). (p_0) | (p_0), λ(id). (p_0) | error (p_0)

When the typechecker examines a phrase occurrence described by the nonterminal (p_n),
it is given a goal describing a set of potential simple types of the phrase that are relevant to
the typing of the enclosing program. When n = 0 this set is finite; when n = ∞ it is the set
of all simple types. Very roughly speaking, when n is nonzero and finite, it describes a set
of procedural types whose first n arguments are arbitrary. The final production displayed
above shows that certain phrases, especially abstractions without type information, are only
permitted in contexts where the goal describes a finite set.

To  make  this  sketchy  description  more  precise,  we  first  define  a  simple  type  to  be  a  type 
with  no  occurrence  of  &  except  on  the  left  of  one  or  more  arrows.  More  formally, 

\[ \omega ::= \rho \mid \theta \to \omega \mid \iota\colon \omega \qquad \text{(simple types)} \]

To  within  equivalence,  every  type  is  an  intersection  of  simple  types.  To  express  this  fact, 
we  define  the  function  s,  which  maps  types  into  finite  sets  of  simple  types,  as  follows: 

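\[
\begin{aligned}
s\,\rho &= \{\rho\} \\
s(\theta \to \theta') &= \{\, \theta \to \omega \mid \omega \in s\,\theta' \,\} \\
s(\iota\colon \theta) &= \{\, \iota\colon \omega \mid \omega \in s\,\theta \,\} \\
s\,\mathbf{ns} &= \{\} \\
s(\theta_1 \mathbin{\&} \theta_2) &= s\,\theta_1 \cup s\,\theta_2 ,
\end{aligned}
\]

and we define the function &, which maps finite sets of types into types, by

\[
\begin{aligned}
\&\,\{\} &= \mathbf{ns} \\
\&\,\{\theta\} &= \theta \\
\&\,\{\theta_1, \ldots, \theta_n, \theta_{n+1}\} &= \theta_1 \mathbin{\&} (\&\,\{\theta_2, \ldots, \theta_{n+1}\}) \quad \text{when } n \ge 1 .
\end{aligned}
\]

(Strictly speaking, this definition only makes sense if one imposes some ordering on the types
$\theta_1, \ldots, \theta_{n+1}$. But this ordering can be arbitrary, since & is commutative with respect to
the equivalence of types.)

It is easy to see, by induction on the structure of simple types, that $s\,\omega = \{\omega\}$ for any
simple type $\omega$. Moreover, it can be shown that, for any type $\theta$ and any finite set $\sigma$ of simple types,

\[ \&(s\,\theta) \simeq \theta \quad \text{and} \quad s(\&\,\sigma) = \sigma . \]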
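
The functions s and & can be sketched in Python as follows. This is only an illustration, not part of the Forsythe implementation: the tuple encoding of types and the names `prim`, `arrow`, `field`, `both`, and `meet` are assumptions introduced here.

```python
from functools import reduce

# Hypothetical tuple encoding of types (an assumption of this sketch).
NS = ("ns",)                                      # the type ns
def prim(name): return ("prim", name)             # primitive type rho
def arrow(arg, res): return ("arrow", arg, res)   # theta -> theta'
def field(label, t): return ("field", label, t)   # iota: theta
def both(a, b): return ("and", a, b)              # theta1 & theta2

def s(theta):
    """Map a type to its finite set of simple types."""
    tag = theta[0]
    if tag == "prim":
        return frozenset({theta})
    if tag == "arrow":
        return frozenset(arrow(theta[1], w) for w in s(theta[2]))
    if tag == "field":
        return frozenset(field(theta[1], w) for w in s(theta[2]))
    if tag == "ns":
        return frozenset()
    if tag == "and":
        return s(theta[1]) | s(theta[2])
    raise ValueError(f"not a type: {theta!r}")

def meet(types):
    """The function &: fold a finite set of types into a single type."""
    items = sorted(types, key=repr)   # any fixed ordering will do
    if not items:
        return NS
    # Builds theta_1 & (theta_2 & (... & theta_n)).
    return reduce(lambda acc, t: both(t, acc), reversed(items[:-1]), items[-1])
```

For instance, `s(arrow(prim("int"), both(prim("int"), prim("real"))))` contains exactly the two simple types int → int and int → real, and `s(meet(sigma))` returns `sigma` for any finite set of simple types.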

It  can  also  be  shown  that 

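\[ \theta \le \theta' \ \text{ if and only if } \ (\forall \omega' \in s\,\theta')(\exists \omega \in s\,\theta)\ \omega \le \omega' , \]

and that $\omega \le \omega'$ if and only if one of the following conditions holds:

1. There are primitive types $\rho$ and $\rho'$ such that

\[ \omega = \rho \ \text{ and } \ \omega' = \rho' \ \text{ and } \ \rho \le_{\mathrm{prim}} \rho' . \]

2. There are types $\theta_1$ and $\theta_1'$ and simple types $\omega_2$ and $\omega_2'$ such that

\[ \omega = \theta_1 \to \omega_2 \ \text{ and } \ \omega' = \theta_1' \to \omega_2' \ \text{ and } \ \theta_1' \le \theta_1 \ \text{ and } \ \omega_2 \le \omega_2' . \]

3. There are an identifier $\iota$ and simple types $\omega_1$ and $\omega_1'$ such that

\[ \omega = \iota\colon \omega_1 \ \text{ and } \ \omega' = \iota\colon \omega_1' \ \text{ and } \ \omega_1 \le \omega_1' . \]

These properties lead directly to an algorithm for computing the predicate $\theta \le \theta'$.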
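
A minimal Python sketch of such an algorithm follows. It is not the Forsythe implementation: the tuple encoding of types is hypothetical, and the primitive ordering is assumed to contain only int ≤ real (plus reflexivity).

```python
# Assumed primitive ordering for this sketch: only int <= real, plus reflexivity.
PRIM_LE = {("int", "real")}

def s(theta):
    """Finite set of simple types of a type (hypothetical tuple encoding)."""
    tag = theta[0]
    if tag == "prim":
        return frozenset({theta})
    if tag == "arrow":
        return frozenset(("arrow", theta[1], w) for w in s(theta[2]))
    if tag == "field":
        return frozenset(("field", theta[1], w) for w in s(theta[2]))
    if tag == "ns":
        return frozenset()
    if tag == "and":
        return s(theta[1]) | s(theta[2])
    raise ValueError(f"not a type: {theta!r}")

def simple_le(w1, w2):
    """omega <= omega' for simple types, following conditions 1 to 3."""
    if w1[0] == "prim" and w2[0] == "prim":
        return w1[1] == w2[1] or (w1[1], w2[1]) in PRIM_LE
    if w1[0] == "arrow" and w2[0] == "arrow":
        # Contravariant in the argument, covariant in the result.
        return subtype(w2[1], w1[1]) and simple_le(w1[2], w2[2])
    if w1[0] == "field" and w2[0] == "field":
        return w1[1] == w2[1] and simple_le(w1[2], w2[2])
    return False

def subtype(t1, t2):
    """theta <= theta': each simple type of theta' has a subtype in s(theta)."""
    return all(any(simple_le(w1, w2) for w1 in s(t1)) for w2 in s(t2))

def equivalent(t1, t2):
    return subtype(t1, t2) and subtype(t2, t1)
```

Under the assumed ordering, `equivalent(("prim", "int"), ("and", ("prim", "int"), ("prim", "real")))` holds, and every type is a subtype of `("ns",)` because s ns is empty.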

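As remarked earlier, a goal is an entity denoting a set of simple types. The simplest kind
of goal is a type $\theta$, which denotes the set $s\,\theta$. But we also need goals that denote certain
infinite sets. Thus we define

\[ \gamma ::= \theta \mid \top \mid \triangleright\gamma \mid \iota\colon \gamma \qquad \text{(goals)} \]

and we extend the function $s$ to map the new goals into the sets they represent:

\[
\begin{aligned}
s\,\top &= \mathcal{S} \\
s(\triangleright\gamma) &= \{\, \theta \to \omega \mid \theta \in \mathcal{T} \text{ and } \omega \in s\,\gamma \,\} \\
s(\iota\colon \gamma) &= \{\, \iota\colon \omega \mid \omega \in s\,\gamma \,\} ,
\end{aligned}
\]

where $\mathcal{S}$ denotes the set of all simple types and $\mathcal{T}$ denotes the set of all types.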

Finally, we define the typechecking function, $\mathit{tc}$, which maps a type assignment, phrase,
and goal into a type. Within equivalence,

\[ \mathit{tc}(\pi, p, \gamma) \simeq \&\,\{\, \omega \mid \omega \in s\,\gamma \text{ and } \pi \vdash p : \omega \,\} . \]

Thus $\mathit{tc}(\pi, p, \gamma)$ will be a greatest lower bound of the set of simple types $\omega$ that belong
to the set denoted by the goal $\gamma$ and also satisfy the typing $\pi \vdash p : \omega$. When $\gamma = \top$
there is no contextual information, corresponding to bottom-up typechecking. At the other
extreme, top-down checking is also encompassed: The typing $\pi \vdash p : \theta$ is valid if and only
if $\mathit{tc}(\pi, p, \theta) \le \theta$. (In fact, this subtype relation will hold if and only if the equivalence
$\mathit{tc}(\pi, p, \theta) \simeq \theta$ holds, since the opposite subtyping $\theta \le \mathit{tc}(\pi, p, \theta)$ will always hold.)

Now we can give a precise description of the indexing in the abstract grammar at the
beginning of this appendix: The nonterminal (p_∞) describes those occurrences of phrases
that will be typechecked with a goal containing $\top$, while the nonterminal (p_n) describes
those occurrences that will be typechecked with a goal that does not contain $\top$, but contains
n occurrences of $\triangleright$.
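
The sets denoted by goals, and the index that a goal determines, can be sketched as follows. Since a goal may denote an infinite set, the sketch tests membership instead of enumerating. The tuple encoding and the names `TOP`, `tri`, `member`, and `index` are assumptions made for illustration only.

```python
import math

TOP = ("top",)                    # the goal that denotes all simple types
def tri(g): return ("tri", g)     # the goal formed with the triangle operator

def s(theta):
    """Finite set of simple types of an ordinary type (hypothetical encoding)."""
    tag = theta[0]
    if tag == "prim":
        return frozenset({theta})
    if tag == "arrow":
        return frozenset(("arrow", theta[1], w) for w in s(theta[2]))
    if tag == "field":
        return frozenset(("field", theta[1], w) for w in s(theta[2]))
    if tag == "ns":
        return frozenset()
    if tag == "and":
        return s(theta[1]) | s(theta[2])
    raise ValueError(f"not a type: {theta!r}")

def member(w, g):
    """Does the simple type w belong to the set denoted by the goal g?"""
    tag = g[0]
    if tag == "top":
        return True                                     # every simple type
    if tag == "tri":
        return w[0] == "arrow" and member(w[2], g[1])   # argument is arbitrary
    if tag == "field":
        return w[0] == "field" and w[1] == g[1] and member(w[2], g[2])
    return w in s(g)                                    # g is an ordinary type

def index(g):
    """Infinity if g contains the top goal, else the number of triangles."""
    tag = g[0]
    if tag == "top":
        return math.inf
    if tag == "tri":
        return 1 + index(g[1])
    if tag == "field":
        return index(g[2])
    return 0
```

A goal with two triangles in front of comm accepts int → real → comm but rejects int → comm, and has index 2.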

It is important to realize that, even though some goals are also types, goals play a different
role than types, and are therefore a different kind of entity. Specifically, the equivalence
relation on types is inappropriate for goals, since the function $\mathit{tc}$ does not map equivalent
goals into equivalent types. For instance, $\mathbf{int} \simeq \mathbf{int} \mathbin{\&} \mathbf{real}$, but (for any type assignment $\pi$),

\[ \mathit{tc}(\pi, 0.5, \mathbf{int}) = \mathbf{ns} \quad \text{and} \quad \mathit{tc}(\pi, 0.5, \mathbf{int} \mathbin{\&} \mathbf{real}) = \mathbf{real} , \]

which are not equivalent types.
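
The two results above can be reproduced with a small sketch restricted to primitive types and intersections. It assumes that a constant possesses exactly the simple types above its principal type, that 0.5 has principal type real, and that the only nontrivial primitive subtyping is int ≤ real. The result is returned as a set of simple types, with the empty set standing for ns.

```python
# Assumed primitive ordering: int <= real, plus reflexivity.
PRIM_LE = {("int", "real")}

def s(theta):
    """theta is "ns", a primitive type name, or ("and", theta1, theta2)."""
    if theta == "ns":
        return frozenset()
    if isinstance(theta, str):
        return frozenset({theta})
    _tag, left, right = theta
    return s(left) | s(right)

def prim_le(p, q):
    return p == q or (p, q) in PRIM_LE

def tc_const(principal, goal):
    """Simple types in s(goal) possessed by a constant of the given principal
    type; their intersection is the result of tc (empty set = ns)."""
    return frozenset(w for w in s(goal) if prim_le(principal, w))
```

Here `tc_const("real", "int")` is the empty set (that is, ns), while `tc_const("real", ("and", "int", "real"))` contains only real, even though the two goals are equivalent as types.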

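The solution to this problem is to adopt a different equivalence relation $\doteq$ for goals:

\[ \gamma \doteq \gamma' \ \text{ iff } \ (\forall \omega \in s\,\gamma)(\exists \omega' \in s\,\gamma')\ \omega \simeq \omega' \ \text{ and } \ (\forall \omega' \in s\,\gamma')(\exists \omega \in s\,\gamma)\ \omega \simeq \omega' . \]

For this relation, one can show that, if $\gamma \doteq \gamma'$ then $\mathit{tc}(\pi, p, \gamma) \simeq \mathit{tc}(\pi, p, \gamma')$.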

Under certain circumstances, the typechecker can require time that is exponential in the
length of its input. This can happen because a single call $\mathit{tc}(\pi, p, \gamma)$ can cause more than
one recursive call for the same subphrase of $p$ under any of the following circumstances:

1. $p$ is a lettype declaration containing an alternative type construction with several
alternatives,

2. $p$ is an explicitly typed abstraction containing an alternative type construction with
several alternatives,

3. $p$ is an implicitly typed abstraction and $\gamma$ is an intersection of several procedural types.

One can expect the programmer to be aware of what is happening in the first two cases, since
the multiple alternatives would occur explicitly in the program. The last case, however, can
be more insidious and subtle. For instance, consider the call

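\[ \mathit{tc}(\pi,\ \mathbf{let}\ c \equiv \mathit{newintvar}\ 0\ \lambda x.\ B\ \mathbf{in}\ \cdots,\ \mathbf{comm}) . \]

Since c is not explicitly typed, this leads to the call

\[ \mathit{tc}(\pi,\ \mathit{newintvar}\ 0\ \lambda x.\ B,\ \top) . \]

In turn, assuming that newintvar 0 has the type

\[
\begin{aligned}
&(\mathbf{intvar} \to \mathbf{comm}) \to \mathbf{comm} \mathbin{\&} (\mathbf{intvar} \to \mathbf{compl}) \to \mathbf{compl} \mathbin{\&} (\mathbf{intvar} \to \mathbf{int}) \to \mathbf{int} \mathbin{\&} \\
&(\mathbf{intvar} \to \mathbf{real}) \to \mathbf{real} \mathbin{\&} (\mathbf{intvar} \to \mathbf{bool}) \to \mathbf{bool} \mathbin{\&} (\mathbf{intvar} \to \mathbf{char}) \to \mathbf{char} ,
\end{aligned}
\]

this leads to the call

\[
\begin{aligned}
\mathit{tc}(\pi,\ \lambda x.\ B,\ &\mathbf{intvar} \to \mathbf{comm} \mathbin{\&} \mathbf{intvar} \to \mathbf{compl} \mathbin{\&} \mathbf{intvar} \to \mathbf{int} \mathbin{\&} \\
&\mathbf{intvar} \to \mathbf{real} \mathbin{\&} \mathbf{intvar} \to \mathbf{bool} \mathbin{\&} \mathbf{intvar} \to \mathbf{char}) .
\end{aligned}
\]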
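
Naively, one would expect this to lead to six calls that all typecheck the same abstraction
body:

\[
\begin{aligned}
&\mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{comm}) \quad \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{compl}) \quad \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{int}) \\
&\mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{real}) \quad \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{bool}) \quad \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{char}) .
\end{aligned}
\]

But in fact, the typechecker will take advantage of the equivalence relation on goals to replace the
goal

\[
\begin{aligned}
&\mathbf{intvar} \to \mathbf{comm} \mathbin{\&} \mathbf{intvar} \to \mathbf{compl} \mathbin{\&} \mathbf{intvar} \to \mathbf{int} \mathbin{\&} \\
&\mathbf{intvar} \to \mathbf{real} \mathbin{\&} \mathbf{intvar} \to \mathbf{bool} \mathbin{\&} \mathbf{intvar} \to \mathbf{char}
\end{aligned}
\]

by the equivalent goal

\[ \mathbf{intvar} \to (\mathbf{comm} \mathbin{\&} \mathbf{compl} \mathbin{\&} \mathbf{int} \mathbin{\&} \mathbf{real} \mathbin{\&} \mathbf{bool} \mathbin{\&} \mathbf{char}) , \]

which leads to the single call

\[ \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{comm} \mathbin{\&} \mathbf{compl} \mathbin{\&} \mathbf{int} \mathbin{\&} \mathbf{real} \mathbin{\&} \mathbf{bool} \mathbin{\&} \mathbf{char}) . \]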

Although  a  full  discussion  of  the  subject  is  beyond  the  scope  of  this  report,  this  example 
illustrates  how  a  careful  choice  of  canonical  forms  for  types  and  goals  can  enhance  the 
efficiency of typechecking. Among the programs in this report, the only implicitly typed
abstractions whose goals necessitate checking their bodies more than once are:

1.  In  Section  10,  the  binding  of  fin  in  the  final  definition  of  newintvarres, 

2.  In  Section  11,  the  bindings  of  b  in  the  initial  definition  of  newintvararray  and  the 
definition  of  newtrivararray, 

3.  In  Section  11,  the  bindings  of  X  in  the  definitions  of  slice  and  slicecheck. 

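Further experience will be needed, however, before we can be confident that our
typechecker is reasonably efficient in practice. As an illustration of how close we are to the edge
of disaster, notice that we have avoided the temptation of giving the declarator newintvar 0
the type

\[
\begin{aligned}
&(\mathbf{intvar} \to \mathbf{comm}) \to \mathbf{comm} \mathbin{\&} (\mathbf{intvar} \to \mathbf{compl}) \to \mathbf{compl} \mathbin{\&} (\mathbf{int} \to \mathbf{int}) \to \mathbf{int} \mathbin{\&} \\
&(\mathbf{int} \to \mathbf{real}) \to \mathbf{real} \mathbin{\&} (\mathbf{int} \to \mathbf{bool}) \to \mathbf{bool} \mathbin{\&} (\mathbf{int} \to \mathbf{char}) \to \mathbf{char} ,
\end{aligned}
\]

which makes explicit the fact that local integer variables cannot be assigned within
expressions. With this choice of type, the typechecking call

\[ \mathit{tc}(\pi,\ \mathit{newintvar}\ 0\ \lambda x.\ B,\ \top) \]

would lead to two calls for B:

\[ \mathit{tc}([\,\pi \mid x\colon \mathbf{intvar}\,], B, \mathbf{comm} \mathbin{\&} \mathbf{compl}) \qquad \mathit{tc}([\,\pi \mid x\colon \mathbf{int}\,], B, \mathbf{int} \mathbin{\&} \mathbf{real} \mathbin{\&} \mathbf{bool} \mathbin{\&} \mathbf{char}) , \]

so that typechecking would become exponential in the number of nested variable declarations.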
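
The difference between the two typings of newintvar 0 can be illustrated by grouping the procedural goals for the abstraction by argument type, since arrows with the same argument denote the same set of simple types as a single arrow into the intersection of their results. The sketch below is hypothetical (types are plain strings, and `body_calls` and `innermost_checks` are names invented here); it only counts the calls that would be made for the body B.

```python
def body_calls(arrows):
    """Group simple procedural goals (argument, result) by argument type.
    Each group stands for one call on the abstraction body, whose goal is
    the intersection of the grouped result types."""
    groups = {}
    for arg, res in arrows:
        groups.setdefault(arg, []).append(res)
    return groups

RESULTS = ["comm", "compl", "int", "real", "bool", "char"]

# Goal for the abstraction under the type actually given to newintvar 0.
chosen = [("intvar", r) for r in RESULTS]

# Goal under the tempting alternative type discussed above.
tempting = [("intvar", "comm"), ("intvar", "compl")] + [("int", r) for r in RESULTS[2:]]

def innermost_checks(calls_per_declaration, depth):
    """Number of times the innermost body is checked when each of `depth`
    nested declarations causes the given number of calls on its body."""
    return calls_per_declaration ** depth
```

With the chosen type there is one group, hence one call per declaration; with the alternative type there are two groups, so ten nested declarations would cause the innermost body to be checked 2**10 = 1024 times.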

Despite this cautionary example, we hope that our typechecker, perhaps with further
refinements, will be reasonably efficient in normal practice. It should be noted, however,
that worst-case inefficiency is inevitable. In fact, it can be shown that the typechecking
problem for Forsythe (or any other language using intersection types) is PSPACE-hard.

The proof is obtained by reducing the problem of evaluating quantified Boolean
expressions, which is known to be PSPACE-complete [29], to the type inference problem. The
reduction is obtained by translating a quantified Boolean expression $B$ into a Forsythe
phrase $B^*$ as follows:

\[
\begin{aligned}
(B_1 \wedge B_2)^* &= \mathit{And}\ B_1^*\ B_2^* \\
(B_1 \vee B_2)^* &= \mathit{Or}\ B_1^*\ B_2^* \\
(\neg B)^* &= \mathit{Not}\ B^* \\
((\forall x)B)^* &= \mathit{Forall}\ (\lambda x\colon \mathbf{t} \mid \mathbf{f}.\ B^*) \\
((\exists x)B)^* &= \mathit{Exists}\ (\lambda x\colon \mathbf{t} \mid \mathbf{f}.\ B^*) \\
x^* &= x ,
\end{aligned}
\]

where t and f are distinct types, neither of which is a subtype of the other, and And, Or, Not,
Forall, and Exists are identifiers not occurring in the original expression. (For the particular
typechecker described in this appendix, one can omit the alternative type expressions $\mathbf{t} \mid \mathbf{f}$.)
In addition a truth value $b$ is translated into a type $b^*$ by

\[ \mathbf{true}^* = \mathbf{t} \qquad \mathbf{false}^* = \mathbf{f} . \]

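Let $\beta$ be an assignment of truth values to the free variables of a quantified Boolean
expression, and let $\pi$ be the type assignment that maps each of these variables $x$ into $(\beta\,x)^*$,
and maps the additional variables of $B^*$ as follows:

\[
\begin{aligned}
\pi(\mathit{And}) &= (\mathbf{f} \to \mathbf{f} \to \mathbf{f}) \mathbin{\&} (\mathbf{f} \to \mathbf{t} \to \mathbf{f}) \mathbin{\&} (\mathbf{t} \to \mathbf{f} \to \mathbf{f}) \mathbin{\&} (\mathbf{t} \to \mathbf{t} \to \mathbf{t}) \\
\pi(\mathit{Or}) &= (\mathbf{f} \to \mathbf{f} \to \mathbf{f}) \mathbin{\&} (\mathbf{f} \to \mathbf{t} \to \mathbf{t}) \mathbin{\&} (\mathbf{t} \to \mathbf{f} \to \mathbf{t}) \mathbin{\&} (\mathbf{t} \to \mathbf{t} \to \mathbf{t}) \\
\pi(\mathit{Not}) &= (\mathbf{f} \to \mathbf{t}) \mathbin{\&} (\mathbf{t} \to \mathbf{f}) \\
\pi(\mathit{Forall}) &= ((\mathbf{f} \to \mathbf{f} \mathbin{\&} \mathbf{t} \to \mathbf{f}) \to \mathbf{f}) \mathbin{\&} ((\mathbf{f} \to \mathbf{f} \mathbin{\&} \mathbf{t} \to \mathbf{t}) \to \mathbf{f}) \\
&\quad \mathbin{\&} ((\mathbf{f} \to \mathbf{t} \mathbin{\&} \mathbf{t} \to \mathbf{f}) \to \mathbf{f}) \mathbin{\&} ((\mathbf{f} \to \mathbf{t} \mathbin{\&} \mathbf{t} \to \mathbf{t}) \to \mathbf{t}) \\
\pi(\mathit{Exists}) &= ((\mathbf{f} \to \mathbf{f} \mathbin{\&} \mathbf{t} \to \mathbf{f}) \to \mathbf{f}) \mathbin{\&} ((\mathbf{f} \to \mathbf{f} \mathbin{\&} \mathbf{t} \to \mathbf{t}) \to \mathbf{t}) \\
&\quad \mathbin{\&} ((\mathbf{f} \to \mathbf{t} \mathbin{\&} \mathbf{t} \to \mathbf{f}) \to \mathbf{t}) \mathbin{\&} ((\mathbf{f} \to \mathbf{t} \mathbin{\&} \mathbf{t} \to \mathbf{t}) \to \mathbf{t}) .
\end{aligned}
\]

Then it is easy to see that $B$ evaluates to $b$ under the truth-value assignment $\beta$ if and only
if the typing $\pi \vdash B^* : b^*$ is valid.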
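
The correspondence can be checked on small examples with the following sketch. It is not a typechecker for Forsythe: it handles only phrases of the form $B^*$, represents quantified Boolean expressions as hypothetical nested tuples, and reads the tables directly off the type assignment $\pi$ above. Note that the body of each quantifier is examined twice, once under x: f and once under x: t.

```python
# Tables read off the intersection types assigned to And, Or, Not.
AND = {("f", "f"): "f", ("f", "t"): "f", ("t", "f"): "f", ("t", "t"): "t"}
OR = {("f", "f"): "f", ("f", "t"): "t", ("t", "f"): "t", ("t", "t"): "t"}
NOT = {"f": "t", "t": "f"}
# Keyed by (type of body when x: f, type of body when x: t).
FORALL = {("f", "f"): "f", ("f", "t"): "f", ("t", "f"): "f", ("t", "t"): "t"}
EXISTS = {("f", "f"): "f", ("f", "t"): "t", ("t", "f"): "t", ("t", "t"): "t"}

def star(b):
    """Translate a truth value into the type t or f."""
    return "t" if b else "f"

def evaluate(B, beta):
    """Evaluate a quantified Boolean expression under the assignment beta."""
    tag = B[0]
    if tag == "var":
        return beta[B[1]]
    if tag == "not":
        return not evaluate(B[1], beta)
    if tag == "and":
        return evaluate(B[1], beta) and evaluate(B[2], beta)
    if tag == "or":
        return evaluate(B[1], beta) or evaluate(B[2], beta)
    branches = [evaluate(B[2], {**beta, B[1]: v}) for v in (False, True)]
    return all(branches) if tag == "forall" else any(branches)

def type_of(B, pi):
    """The type (t or f) of the translated phrase under the assignment pi."""
    tag = B[0]
    if tag == "var":
        return pi[B[1]]
    if tag == "not":
        return NOT[type_of(B[1], pi)]
    if tag == "and":
        return AND[type_of(B[1], pi), type_of(B[2], pi)]
    if tag == "or":
        return OR[type_of(B[1], pi), type_of(B[2], pi)]
    on_f = type_of(B[2], {**pi, B[1]: "f"})   # body checked with x: f
    on_t = type_of(B[2], {**pi, B[1]: "t"})   # body checked again with x: t
    return (FORALL if tag == "forall" else EXISTS)[on_f, on_t]
```

For example, the expression (∀x)(∃y)((x ∨ y) ∧ ¬(x ∧ y)) evaluates to true, and `type_of` returns `"t"` for its encoding.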

One might object that this reduction maps closed quantified Boolean expressions into
open (and type-open) phrases of Forsythe, and thus might not imply the inefficiency of
typechecking closed phrases. This objection can be overcome, however, by enclosing B*
in the following declarations (which are based on the classical lambda calculus encoding of
boolean values by the projections λx. λy. x and λx. λy. y):

    lettype t ≡ int → ns → int, f ≡ ns → int → int
    in
    let And: (f → f → f) & (f → t → f) & (t → f → f) & (t → t → t) ≡
            λp. λq. λx. λy. p (q x y) y,
        Or: (f → f → f) & (f → t → t) & (t → f → t) & (t → t → t) ≡
            λp. λq. λx. λy. p x (q x y),
        Not: (f → t) & (t → f) ≡
            λp. λx. λy. p y x
    in
    let Forall: ((f → f & t → f) → f) & ((f → f & t → t) → f)
                & ((f → t & t → f) → f) & ((f → t & t → t) → t) ≡
            λh. And (h λx. λy. x) (h λx. λy. y),
        Exists: ((f → f & t → f) → f) & ((f → f & t → t) → t)
                & ((f → t & t → f) → t) & ((f → t & t → t) → t) ≡
            λh. Or (h λx. λy. x) (h λx. λy. y)
    in ··· .

To obtain completely explicit typing, one can annotate the abstractions here as follows:

    x, y: int | ns
    p, q: t | f
    h: f → f | t → f | f → t | t → t .

This  makes  it  clear  that  our  lower  bound  applies  even  to  the  typechecking  of  programs  with 
completely  explicit  type  information. 
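
The untyped behavior of the enclosing declarations can be sketched in Python, which has no intersection types and so illustrates only the lambda calculus encoding of booleans by projections, not the typing argument. The helper `decode` and the names `TRUE` and `FALSE` are introduced here for illustration.

```python
# Booleans as projections: true selects its first argument, false its second.
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

# Curried transcriptions of the definitions of And, Or, Not, Forall, Exists.
And = lambda p: lambda q: lambda x: lambda y: p(q(x)(y))(y)
Or = lambda p: lambda q: lambda x: lambda y: p(x)(q(x)(y))
Not = lambda p: lambda x: lambda y: p(y)(x)
Forall = lambda h: And(h(TRUE))(h(FALSE))
Exists = lambda h: Or(h(TRUE))(h(FALSE))

def decode(b):
    """Convert an encoded boolean back into a Python bool."""
    return b(True)(False)

# (forall x)(x or not x) and (exists x)(x and not x)
tautology = Forall(lambda x: Or(x)(Not(x)))
contradiction = Exists(lambda x: And(x)(Not(x)))
```

Here `decode(tautology)` is `True` and `decode(contradiction)` is `False`, matching the truth values of the two quantified expressions.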